JOURNAL OF THE AMERICAN
STATISTICAL ASSOCIATION

VOLUME 49
NUMBERS 265-268

Published Quarterly by the
AMERICAN STATISTICAL ASSOCIATION
WASHINGTON, D. C.
1954


The Editors welcome the submission of manuscripts for possible publication.
They should be typewritten entirely double-spaced, including footnotes, and
two copies should be sent to the Editor, W. Allen Wallis, 207 Haskell Hall,
University of Chicago, Chicago 37. Books for review should be sent to the same
address. Unsolicited book reviews are not accepted, but suggestions of titles for
review are welcome.


EDITOR

W. ALLEN WALLIS, University of Chicago
ASSISTANT TO THE EDITOR: A. LABADIE

ASSOCIATE EDITORS

HAROLD A. FREEMAN                     PHILIP J. MCCARTHY
Massachusetts Institute of Tech.      Cornell University
GEORGE M. KUZNETS                     I. RICHARD SAVAGE
University of California              National Bureau of Standards
                  C. ASHLEY WRIGHT
                  Standard Oil Company (N. J.)


ADVISORY PANEL OF FORMER EDITORS 


WILLIAM G. COCHRAN (1945-50)          FRANK A. ROSS (1926-34, 41-45)
Johns Hopkins University              Thetford, Vermont

WILLIAM F. OGBURN (1920-25)           FREDERICK F. STEPHAN (1935-40)
University of Chicago                 Princeton University




Errata: Readers and authors are urged to submit to the Editor notices of
errors found in this or any previous issue. These will be published once
a year, in the December issue.


EDITORIAL COLLABORATORS


LEO A. AROIAN                            JOSEPH M. CAMERON
Hughes Aircraft Company                  National Bureau of Standards
KENNETH J. ARROW                         DOUGLAS G. CHAPMAN
Stanford University                      University of Washington
GARDNER ACKLEY                           FELIX CHAYES
University of Michigan                   Carnegie Institute of Washington
FORMAN S. ACTON                          HERMAN CHERNOFF
Princeton University                     Stanford University
ARMEN A. ALCHIAN                         CARL CHRIST
University of California (Los Angeles)   Johns Hopkins University
STEPHEN D. ALLEN                         ANSLEY J. COALE
University of Minnesota                  Princeton University
R. L. ANDERSON                           A. C. COHEN, JR.
North Carolina State College             University of Georgia
FRED C. ANDREWS                          WILLIAM S. CONNOR
University of Nebraska                   National Bureau of Standards


KENNETH J. ARNOLD C ë 
Michigan State College . Warren N. aF 
M. 8. BARTLETT 3 A. C. Nielsen Company 
University of Manchester B in .Com nii 
EanL F. BEACH ө - 2 Сота University 
McGill University Duper J. Соурех я 
Ковент BECHHOFER University of North Carolina 
Cornell University Срсп, C. Craig 
- Martin BERGER University of Michigan 
National Bureau of Standards Joun H. Curtiss " 
ABRAM BERGSON + New York University « 
Columbia University Јоверн F. Dany 
Јоверн BERKSON * | . Buftau of the Census Me » 
Мао Climie e EvrANOR S. DANIEL 
Max A. BERSHAD The MutualLifg Insurance Company 
Bureau of the Census . of New ee е | 
Bs галаи F. DAVID 
~~ Columbia University к.А) 
E е gt ea 
niver: а: t риск ; 
"Davip ae У ki dg University of Chicago 
Columbia University W. EDWARDS Da . 
oe E f Paes . т К S Arta Bee y 2 
‘omell Universi AUL М, : 
Dow» J. Вой ж University of Pittsburgh 
Miami University Cyrus DERMAN — е 
Paur A. Вовснам * ues Columbia Undversity 
New York City f: WILFRID J. DIXON . 
-ALBERT H. BOWKER © The Rand Corporation 
Stanford University 1: ROBERT DORFMAN 3 
б . Box € i University of California (Berkeleydoona + ^ 
Imperial Chemical Industries, Ltd. Hanorp F. Dorn 
R. A. BRADLEY ae Nafional Institute КАШ е 
Virginia Polytechnic Institute „ Franegs W. DRESCH е 
Ermer C. Ввлтт. t 2 P ато Research Institute i 
Lehigh University =. Н. Duncan Ы 
s neta Chicago "Standard Oil Company (N. J.) à 
Invia W. 5 D. B. DUNCAN 1 E es 
Purdue Una rt е Virginia Polytechnic Institufé Ы 
Grenn L, Burrows 2. CHARLES DUNNETT * 
Bureau of Agrioultusal Economics с Cornell. University ie 
iv AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1954


MEYER DWASS                              P. C. HAMMER
Northwestern University                  University of Wisconsin
PAUL S. DWYER                            MORRIS H. HANSEN
University of Michigan                   Bureau of the Census
CHURCHILL EISENHART                      FRANK HARARY
National Bureau of Standards             University of Michigan
FREDERICK A. EKEBLAD                     ARNOLD HARBERGER
Northwestern University                  University of Chicago
HOPE T. ELDRIDGE                         BOYD HARSHBARGER
United Nations                           Virginia Polytechnic Institute
BENJAMIN EPSTEIN                         HARRY P. HARTKEMEIER
Wayne University                         University of Missouri
HELEN C. FARNSWORTH                      H. O. HARTLEY
Stanford University                      Iowa State College
ROBERT FERBER                            JAMES B. HASSLER
University of Illinois                   University of California (Berkeley)
ENOCH B. FERRELL                         MILLARD W. HASTAY
Bell Telephone Laboratories              National Bureau of Economic Research
CLARENCE B. FINE                         WALTER A. HENDRICKS
Bureau of Internal Revenue               Department of Agriculture
A. L. FINKNER                            CLIFFORD HILDRETH
North Carolina State College             North Carolina State College
EVELYN FIX                               JACK HIRSHLEIFER
University of California (Berkeley)      The Rand Corporation
JOHN K. FOLGER                           WALTER E. HOADLEY, JR.
Southern Regional Education Board        Armstrong Cork Company
RICHARD J. FOOTE                         J. L. HODGES, JR.
Bureau of Agricultural Economics         University of California (Berkeley)
ROBERT N. FORD


: American Telephone and Telegraph Wasst.y Нокғғріча 

Y. ROAD URS University of North Carolina 
йен of tania (Bein) Үр Mtem л, 
ALEXANDER GERSCHENKRON Dix O HORUM 


Harvard. University 5 


ТАИ анто A 8 ЕНУ, ‹ of Pittsburgh 
i FE і à AnRY M. HuGHES 
DOE байеаду University of California (Berkeley) 
University of Chi > 7 ЖїпллАм HURWITZ 
(cuisse len Stanford. University 
Massachusetts Institute of Technology ABRAM J. АЕРГЕ 
Witson Н. GRABILL Columbia University A 
Bureau of the Census J53LIUS JAHN "o 
Howanp WHIPPLE GREEN › State College of Washington o 
i она Health Council Енем Е. JARLETT ^ 
RVING I. GRINGORTEN iversii ij 7 
EOM Force Cambridge Research Lab- тиа fmm Em 
oratory 
Ewu J. Gomer diy les 
р U nivereny p Ünieersity of California (Berkeley) 
А . J. JESSEN 


з: 
te Go Towa State College 


К Israel Insticute of Applied Social Re- Hanorp F. Dorn 


ч search A Nciional Institute of Health 
Crausw D. HADLEY * > 5» WILLIAM О. JONES. 
" University of Santa Clara > — Stanford University , 
Mareanret J. Hacoop Norman M. KAPLAN 
5Burequ of Agriculturcl Economics The Rand Corporation 
? Max HALPERIN Leo Zarz A 
, .. National Heart Institute Michigan State College 
. C. Horace HAMILTON Oscar KEMPTHARNE 
Ne orth, Carlina State College › Towa State College 




Jack KIEFER Harry MagkOowrmz — * 
Cornell. University е The Rand Corporation 
Dayrp KENDALL ANDREW W. MARSHALL с 
niversity of Oxford Hugs: DONE " 
Joun W. KENDRICK " Kp&NETH 
Departntent of Comħerce Carleton College 
Frank L. KIDNER Garnett E. bre * 
University of Calsfornia (Berkeley University of Manitoba 
BRADFORD Е. KrwBASL Quinn McNemar 
New York Public Service Compassion: Stan; one Leva 
Ералв P. Kine. * e PauL 
‚ * Eli Lilly awd Company Күс denen University 
. “Tusurg KISH FREDERICK C. MILLS 
Upiversity v of Michigan Columbia University 
LAWRENCE KLEIN . Guorrrey H; MOORE 
Universi bf Michigan National Bureau of Heonomia Re- 
Ілотр A. KNOWLER search 
University of Towa 9 Norman MORSE 
* Н. 8. KONIJN Cornell University 
University of California (Berkeley) J. E. Morton 
CARL Kossack Cornell University 
Purdue University . Lrncotn Moses 
%їплллм Н. KRUSKAL » Stanford University 
University of Chicago FREDERICK MOSTELLER 
Ernest KURNOW * Un Vc Bee Chicago 
New York University ROBERT J. 
Simon Kuznets Social ‘Security 4 Administration 
University of Pennsylvania Joun NETER 
PAUL LAZARSFELD "Syracuse University 
Columbia University J. NEYMAN 
Ivan M. Lene University of California (Berkeley) . 
University of California Becks) * Скокок E. NICHOLSON, JR. 
Tomas LEHRER Uniwersity of North Carolina * 
Harvard University? HanoLp NISSELSON 
J. M. LeTICHE Bureau of the Census 
University of California (Berkeley) GOTTFRIED NOETHER 9 
GERALD J. LIEBERMAN Boston University 
Stanford Universitys Epwarp B. OLDS 
JULIUS LIEBLEIN > — Sgdal Planning Council of St. Louis 
National Bureau of “Standards EpwiN G, Ops 
RICHARD BINK Carnegie Institute of Темни. 
Sandie Corporation PauL. S. OLMSTEAD 
Henry I>. LOCKE * Beli Telephone Laboratories 
Liberty Mutual I: neurancé Compan) Guy Овсотт y 
Irvine Loros ® Harvard d University 
Columbia University. Guapys PALMER в 
SUDAN LUCE т e à cett of Pennsylvania 
assachusetts Institute of Technolog OHN En ol 
"EUGENE LUKACS d U American Telephone and Telegraph 
Office of Naval Research Company 
Duncan Macintyre е Eveens W. PIKE qon 
Corhell University E Raytheon Manufacturing Gompany 
Grorce F, Mam ** Отто POLLAK P 
Princeton University | e Wi School 
Suerman J. MAISEL DANE Q. PRICE » 
University o; Ren (Berkeley) = University of North Carolina 
Bensamin J. MANDEL FRANK PROSCHAN 
Social Security Administration Sylvania Elegtric Products Corporaiign i 
Henry B. мр. i e ЈоѕерН PUTTER 
Ohio State University University of California (Berkeley) 
NATHAN MANTEL, : Monrox 8. RAFF 
National Institutes sy Health | eU. 8. Bureau of Public.Rouds 


Aran Ross 


Iowa State College 


J. A. БлсмЕ? ILLIAM A. SPURR. 
North Carolina State College р Stanford University 
STANLEY REITER >  RoBERT B. STEFFES 2 
> . Stanford University ] Bureau of the Budget 3 
Paur Ё. RIDER А JOSEPH STEINBERG К 
Wright-Patterson Air Force Base 3 Bureau of the Census 
ES V. MAS > N ja oh T2 р 
niversity oj icago London 'conomics 
ENRY С. Romie S R. M. боурвом? м 
Pacific Palisades, California University of Rangoon Е 
Murray ROSENBLATT э С. М. Syuonps . а 
University of Chicago Esso Standard Oil Conipony D 


CONRAD TAEUBER Л 
Bureau of the Census ` D 


Lester SARTORIUS . TEICHROEW 

University of Illinois Institute for Numerical Analysis 
FRANKLIN B. SATTERTHWAITE; Mir E. TERRY 

General Electric Company а Bell Telephone Laboratories 
Іжомавр J. SavAGE Donorny 8. Тномлѕ 

University of Chicago nr of Pennsylvania 

ENRY SCHEFFÉ Donovan J. Тномрѕом 

University of California (Berkeley) University of Pittsburgh 
зма b Deparime à У Socigt е. OE ER Сойер 

ew Yor. ment of Soci; el- оша State College 
fe HJ p C. K. Tiao 


fare 
Epwarp E. SCHWARTZ 


Harry SCHWARTZ 


Children’s Bureau 


Wayne University 
Joun W. Tuxer , 
Princeton University 


New York Times Davin VALINSKY 
ELIZABETH L. Scorr = City College of New York 
« University of California (Berkeley) » Davrp VorAw Ы 
Ерулвр 8. Saaw i Yale University 
> — Brookings Institution ә Davi L. ЙАмАбЕ , > 
G. D? SHELLARD University of Chicago 
Metropolitan Life Insurance Com- Joun E. WALSH 
pary Apu › Ncval Ordnance Test Station 
Henry S. Suryock, Jn. Aurrep N. Watson 
Bureau of the Census e > 0, „Wesleyan University Press 


J4con 8. SrzazL, в ^ FnEDERIUK V. Waver 


Bureau of the Census ? 
ROSEDITH SITGREAVES 


в. of Agricultural Economics 


Stanford University University of Leeds 3 
Јону H. 8мттн Rocer I. WILKINSON MS 
S Shae University 8. Bor retentions Laboratories > 

плом Son d EM 

Cornell DD ey я š NS University 
Hurserr SOLOMON Federal Rew Board 

Columbia University › eserve Boar 


Ro: 


= assachusetts Institute of Technology 


BERT M. Sorow 


Max WoopBunr 
University of Pennsylvania 
F. Yarro 


ORTIMER SPIEGELMAN Rothamsted Experiment Д 
Metropdtitan, Life Insurance Com- мааа Er 
pany » National Bureau of Standards 


› Жел УЗА С 



JOURNAL OF THE AMERICAN
STATISTICAL ASSOCIATION


CONTENTS OF VOLUME 49

THE 113TH ANNUAL MEETING
    MINUTES OF THE ANNUAL BUSINESS MEETING
    SUMMARIES OF PAPERS DELIVERED
ARTICLES
STATISTICAL ABSTRACTS
BOOK REVIEWS
PUBLICATIONS RECEIVED

R&npom Рюпв*„_ . . -e . . ©. _.*. 206,410, 682, 928, 
ERRATA

INDEX TO VOLUME 49, 1954
    ARTICLES, BY AUTHOR
    BOOK REVIEWS, BY AUTHOR
    LIST OF REVIEWERS

REPORTS AND OFFICIAL NOTICES
    REPORT OF THE BOARD OF DIRECTORS
    REPORT OF THE SECRETARY-TREASURER
    REPORT OF THE AUDITORS


JOURNAL OF THE AMERICAN 
STATISTICAL ASSOCIATION 


Number 265                    MARCH 1954                    Volume 49
THE PRESENT STRUCTURE OF THE ASSOCIATION* 


WILLIAM G. COCHRAN
The Johns Hopkins University


FIVE years ago, the Association adopted a new Constitution which
was intended to facilitate substantial changes in the nature of the
Association. Written Constitutions are not noted for their ability to
grip and hold the reader's interest, and I doubt whether many members
paid more attention to the new Constitution than was necessary in
deciding how to vote on it in 1948. I would like to present some
impressions of the experience of the Association during the first
five years of operation under the new Constitution. I hope that this
account will give members a better picture of the present nature of the
Association and will lead up to several questions concerning our future
development about which I wish to encourage members to do some
thinking.

THE SITUATION AS IT APPEARED IN 1945

Planning for a new Constitution began when the Association was
able to resume normal activities towards the end of World War II.
In the early discussions about a suitable future pattern for the Associ-
ation, the committee at work on the new Constitution took note of
four developments in the field of statistics that seemed relevant.

1. Statistical techniques had penetrated into a great variety of fields.
   Up till about 30 years ago, practical statistics dealt mainly with
   applications to economics, business and government, and the in-
   terests of the Association's members tended to reflect this fact. It
   is easy to exaggerate the extent to which this was so: the Associa-
   tion has always welcomed statisticians in any field of knowledge
   and 30 or 40 years ago the Journal was publishing important
   papers on a wide range of topics. But the organized activities of
   the Association dealt largely with applications in the economic
   sphere. In the 30's, however, and still more in the early 40's, the
   increased use of statistical ideas and techniques in such fields as
   psychology, the various branches of biology, medicine, the social
   sciences, industrial research and operations, and marketing was a
   striking phenomenon.
2. During the same period, persons interested in these newer develop-
   ments had founded a number of new societies, among them the
   Institute of Mathematical Statistics, the Econometric Society, the
   Psychometric Society and the American Society for Quality Con-
   trol. All these societies were strongly concerned with statistical
   techniques, but none of them had any formal relation to the ASA.
3. The membership of the ASA was increasing and might be expected
   to grow rapidly in the post-war years. In 1945 there were about
   3,300 members; at present, there are close to 5,000.
4. With the formation of the United Nations, some of its agencies
   might be expected to foster new developments in international
   statistics.

* Presidential address at the 113th Annual Meeting of the American Statistical Association, Wash-
ington, D. C., December 28, 1953.

In considering the future of the Association in the light of these
factors, two principal choices appeared to be open. The Association
might continue to give primary attention to applications in economics,
leaving applications in other fields to be taken care of by other societies.
This would have been a reasonable course of action. Although the As-
sociation had received an influx of members whose interests were in
other fields, the primary concern of over half the members in 1945 was
still with applications to economics or business, as revealed by the
1945 Directory.
The second course, the one actually adopted, was to try to give the
Association a central role with regard to all fields of application of sta-
tistics. This decision was advocated by almost all members whose
opinions were sought. It was a wise decision from many points of view,
particularly when no one knew where important statistical applications
might turn up next, when statistical activities were being parcelled
out amongst numerous societies and when a strong national body
might accomplish much in cooperation with international agencies.
We should recognize, however, that the decision involved a real sacri-
fice, at least for a time, by the members in economics and business,
since a relatively homogeneous society catering satisfactorily to them
was to be changed into something more amorphous whose future
course was harder to predict. These members accepted and encouraged
the change with excellent spirit and with, as might be expected, occa-
sional grumbles.

SOME PROVISIONS OF THE 1948 CONSTITUTION

The decision having been taken, the new Constitution was con-
structed so as to introduce a number of devices that would make the
desired changes easier to accomplish. I would like to describe the pur-
poses, as I understand them, of some of the principal provisions in
the 1948 Constitution.

Associated and affiliated societies. One of the most difficult questions
was: what was to be the relation between the ASA and the other so-
cieties dealing with some aspect of statistics that had come into being or
might be established in the future? Much thought was given to this
question, including a study of various mechanisms that had been
adopted by other large central organizations. Finally, it was decided
to try two provisions, called association and affiliation.

Any other society interested in the objects of the ASA may apply to
become an Associated or an Affiliated Society. The status of an Associ-
ated Society is intended for societies whose interest in statistics is
strong; that of an Affiliated Society was intended to cover a looser
type of connection, but since this provision was dropped in our recent
minor revision of the Constitution, I will not go into detail about it.
Proposals for association are examined by our Board of Directors and
Council before a decision is taken to grant the status.

Each Associated Society receives the right to appoint two members
to the Council of the ASA, one member to the editorial board of The
American Statistician, and one member to the ASA Committee on
Publications for each periodical which it publishes. The ASA is required
to offer its publications to the members of Associated Societies on the
same basis as to ASA members, and vice versa.

The arrangement involves a slight loss of autonomy by the ASA.
In return, it establishes a definite method of liaison, makes our Coun-
cil more representative of statistical interests as a whole, and puts us
in a better position to play the kind of central role that was considered
desirable.

Sections and section committees. If the ASA is to be a society whose
members have a great variety of interests, what can be done to ensure
that each of the principal interest-groups within the membership
participates to its own satisfaction?

For dealing with this problem, the ASA had a successful precedent
in the Biometrics Section, which had been in existence for a number of
years. Although only a small fraction of the membership was interested
in biometry as such, this Section arranged programs at each annual
meeting, held joint sessions at meetings of a number of the biologi-
cal societies and published the Biometrics Bulletin with financial back-
ing from the ASA.

The 1948 Constitution encouraged the formation of Sections in other
broad areas by providing for the establishment of Section Committees.
The general function of Section Committees is “to further the develop-
ment of statistics in fields not adequately covered at present by associ-
ated or affiliated societies.” (Article X, 8). These Committees are
represented on the ASA program committee in order to arrange pro-
grams in their individual areas. In course of time, a Section Committee
may draw up a charter which on approval leads to the formation of a
Section. The new Constitution looks still further ahead by providing
that when a Section has grown large enough, the Section Committee
may take the initiative in organizing an Associated Society.

Districts and District Committees. In nation-wide societies that are
small, meetings tend to be on a national level. As the society grows in
numbers, it becomes feasible to hold regional meetings which give more
of the members a chance to participate. In the ASA we have been
fortunate in having a long tradition of meetings both at the national
level and through our Chapters at the local level. In order to encourage
activities and meetings at an intermediate regional level, the Consti-
tution provides for the setting-up of geographical districts. In each,
there is a District Committee, with two members from each ASA
chapter and from each local unit, if there are any, of any Associated or
Affiliated Society. The District Committees thus provide a means for
coordinating the activities of the ASA and related societies at both
the local and regional levels.

Council. Finally, in order to give the membership a broader repre-
sentation in the administration of the ASA, the Constitution created
a new policy-making body, the Council. This consists of the Board of
Directors, the editor of each ASA publication, two representatives
from each district and one from each Section Committee with more than
75 members, as well as representatives of Associated Societies and an
equal number of representatives-at-large. The Board of Directors,
which in former times was the governing body, now serves as the
executive committee of the Council. During 1953, the Council had 34
members, as compared with 13 on the Board.


THE ASSOCIATION'S EXPERIENCE UNDER THE 1948 CONSTITUTION

I would now like to describe how the new devices have operated
during the past 5 years. In cases where things have not as yet worked
quite as actively as was hoped, I do not want to give the impression
of washing dirty linen in public, which would be most reprehensible for
a President. My defense would be that this linen is not dirty, and it is
not being washed, but merely aired.
Associated Societies. Up to the present time, only one organization
has become linked to us through this provision: the East North Amer-
ican Region of the Biometric Society, which might be regarded as one
of our own children grown up, since the Biometric Society is a natural
outgrowth of our Biometrics Section.

This modest beginning is not surprising, because no strenuous efforts
have been made to bring the provision to the attention of other socie-
ties. In my opinion, it is advisable to wait until the ASA has settled
down under the new Constitution before exploring with some of the
other societies the possibility of a closer relationship, although we have
progressed far enough so that any good opportunity for initiating dis-
cussions should not be missed. Perhaps the most propitious times will
be when cooperation has already arisen about some matter of mutual
interest, or when a new society has been launched with the guidance
of the ASA. With the older societies, we may also have to recognize
and handle tactfully a problem of prestige. Some members of these
societies may feel that Association implies in some way a recognition
of a lower status. No such status was intended in framing these pro-
visions, under which the ASA sacrifices some autonomy, but the other
society does not, as is clearly stated in our Constitution.

Sections and Section Committees. Excellent progress has been made in
establishing a well-rounded group of Sections. This year, the Section
on Social Statistics has been added to those on Biometrics, Business
and Economic Statistics and Training in Statistics. A Committee on
Statistics in the Physical Sciences has been at work for 2 years. Jointly
these 5 areas appear comprehensive enough to cover the major inter-
ests of practically all our members, at least for the time being. Perhaps
the largest single group unrepresented by a Section are the members
whose primary interest is in statistical theory. So long as the Institute
of Mathematical Statistics continues to meet with us, as it has done
consistently in the past, such members are unlikely to regard them-
selves as neglected. In arranging the large number of sessions (cur-
rently around 50) which now comprise the program at the annual
meeting, the Section representatives have worked most efficiently and
amicably, and I believe that we have a smooth mechanism for ac-
complishing this complicated task. The Section Committees have also
been active in varying degrees in other projects, and have been called
upon on numerous occasions for advice by the Board and Council.

Districts and District Committees. Activity in arranging meetings of
something approaching a regional character, which was one of the
primary intentions in setting up districts, has proceeded satisfactorily.
The initiative, however, has come from different directions on different
occasions. The interesting programs at the United Nations head-
quarters in New York in 1952 and 1953 were a joint venture by several
Chapters. The successful series of Institutes at the Universities of
Illinois and Pennsylvania and at the Carnegie Institute of Tech-
nology involved cooperative planning among a number of groups,
prominent among them being the Business and Economic Statistics
Section. The regional meeting to be held in San Francisco in December,
1954, will be the responsibility of the Western District. Thus, what was
perhaps the principal object in setting up District Committees is being
achieved, although the Committees themselves have not been uniformly
active.

The Council. In creating the Council, the intent was to give the mem-
bership a larger role in the policy-making of the ASA and perhaps also
to allow for more deliberation on policy problems. I think it is fair to
say that these aims have not been fulfilled thus far. The annual meet-
ing of the Council takes place at the beginning of the new President's
term of office, a day or two after the new Council members have been
elected. The agenda is a full one, with enough questions calling for im-
mediate decision to leave little time or energy for leisurely discussion
of long-range policy problems. The Board members tend to be the more
active participants in the discussion, because they are more familiar
with the issues than those who are not Board members.

It can be argued, of course, that if affairs are running smoothly
without intense Council activity, as they appear to be, there is no point
in looking for more work for the Council just to keep it busy. Also,
a group with around 30 members is of an awkward size for some types
of work and deliberation. The Council can meet at other times and can
be polled by mail, so that it stands ready when any important policy
matter arises. On the other hand, since the Council is our policy-making
body, our most representative body, and the body on which nominees
from other societies will see us in action, there is a strong case for trying
to make it more continuously effective. There are several techniques
that would be worth experimentation, and the Board has been consid-
ering a plan of action. I am sorry that during my term of office I did not
make a beginning.


THE PRESENT STRUCTURE OF THE ASSOCIATION


As indicated previously, the wording of the 1948 Constitution sug-
gests that the ASA would assume a more definitely central role in sta-
tistics by establishing, through association, links with other societies
which recognized this role for the ASA. Section Committees were
apparently regarded as more of an interim mechanism, since the Con-
stitution describes them as applicable to “fields not adequately covered
at present by associated or affiliated societies” and regards them as a
means for organizing an associated society.

As events have turned out, the formation of Sections and Section
Committees has been the predominant feature in the development of
the ASA during the past five years, while only a bare beginning has
been made in linking ourselves with other societies. This has been a
sound order of procedure, in that we have been working hard to try
to serve the whole range of statistics, before putting forward claims
that we are able to do so. It now looks as if many of our most impor-
tant activities during the next few years will be in the hands of the
Sections. I hope that members of Section committees will realize how
important these committees have become. Their useful activity is by
no means confined to helping with the program at the Annual Meetings,
but may include the planning of more specialized meetings, contribu-
tions to the publication program of the ASA and factual studies of prob-
lems that confront the content fields.

As the Sections become larger and better established, what will be
the next step in the evolution of the ASA? In particular, what will
happen if a Section develops into an Associated Society or a society
already in existence in the field of the Section becomes associated with
us? I do not know the answer, but some recent experiences of the
Biometrics Section are worth noting.

After the North American regions of the Biometric Society had
been established, the members of the Biometrics Section began a
lively discussion of the future of this Section. Some members contended
that the Biometrics Section should be dissolved. They claimed that the
new regions of the Biometric Society could take care of the welfare of
biometry in this country, that their administration would to a large
extent be in the hands of ASA members anyway, and that continua-
tion of the Biometrics Section would be an unnecessary duplication of
effort.

An opposing view was that for a statistician, membership in the
Biometric Society serves a different purpose from membership in the
Biometrics Section. At present, about half the members of the Biometric
Society are biologists. If this Society is to flourish in its original objec-
tives, it must continue to attract to membership a large number,
preferably a majority, of biologists who would not join any statistical
association. Thus the Biometric Society gives the statistician the oppor-
tunity to talk with biologists, learning their problems, working with
them, and presenting new techniques for criticism and use. The ASA,
on the other hand, is the place where statisticians in biometry can talk
with statisticians in other content fields, both to find out what new
techniques have developed in these fields and to present new ideas in
biometry. From this point of view there was a strong argument for
continuing the Biometrics Section as a nucleus for attracting future
biometricians into the ASA, for cooperating with other Sections and
for organizing programs on new or recent discoveries, where the techni-
cal level would be too high for most biologists.

After much debate, the decision was taken to continue the Biometrics
Section. I do not claim that it was the argument given above
which carried the day. Biometricians, like other statisticians, are fond
of nice logical distinctions, and each tends to put forward a slightly
different reason for advocating the same decision, and to attach great
importance to the superiority of his reason over anyone else's, even
though to an outsider the reasons are practically indistinguishable.
But I hope that the argument will not be overlooked if other Sections
blossom into full societies and their members are uncertain whether
to continue the Section. If this concept of the purpose of a Section is
sound, the greatest benefit will be obtained from the present ASA
structure only if there is sustained cooperation among Sections and if
members make a habit of attending sessions of several different Sections.

There is, of course, nothing to prevent a member from belonging to 
every Section. 

If the opposing view prevails, and if we are to look forward to seeing
the Sections disband one by one as Associated Societies are formed
(as might happen if there is a general lack of interest in continuing the
Sections) then the structure of the ASA will evolve towards something
different. A conservative might comment that it would then resemble
either a jellyfish or an octopus, depending on how one looks at it. More
seriously, I do not mean to suggest that Sections should be kept alive
if there is no intrinsic life in them. We should, however, have to
re-examine the whole problem of the best type of structure for the ASA
under the changed conditions. Actually, some types of organization
that did not involve Sections at all were examined in the initial work
for the 1948 Constitution, but were rejected as being unsuitable in our
present state of growth.


SOME QUESTIONS CONCERNING THE VITALITY OF THE ASSOCIATION


To consider our present structure from a slightly different point of
view, I would now like to pose a few broad questions which bear upon
what might be called the state of health of the Association.

Can the ASA maintain the enthusiastic support of its members? Any
large and heterogeneous society is likely to find that it is nobody's
darling, because the affections of the members are accorded to some
smaller and more homogeneous group in which they feel more at home.
As the Association grows larger in its new role, it may be more difficult
to give the members a real sense of participation. The Journal and The
American Statistician, as the most tangible benefits from membership,
have an important part to play, and it is currently planned to supplement
these periodicals from time to time with special monographs and
other publications of interest to the members. Meetings of a local or
regional character are a beneficial addition to our Annual Meetings
as a means of bringing together more of our members. Our Chapters
and Sections may accomplish much in giving members a more immediate
focus for their interests. Continued joint activity by different Sections
will avoid a partitioning into self-contained groups that has
occurred in some societies. In addition, I hope that members will
continue to agree that statistics needs an all-embracing society, and will
appreciate that the Association will inevitably become more diffuse as
it succeeds in adopting this role.

Can the ASA continue to recruit young members? It is relatively painless
for them to enter into membership: students pay only half the
regular dues, as do also members under 30 during their first year. The
office conducts a continuing campaign to spread information about
membership, the groups approached being varied from year to year.
As in other societies, our office finds that nothing succeeds so well as
a personal approach from a present member, so that it is to our members
and to the quality of our publications that we must look mainly
for a steady recruitment of young persons.
Does the structure of the ASA encourage younger members, as they mature,
to participate in the running of the ASA? Since the rapid growth of
statistics is recent, we suffer relatively little from government by the
grey-haired. Nevertheless, many of our most experienced members are
heavily burdened with activities on behalf of scientific societies. For
this reason, as well as to keep us supplied with fresh points of view, the
talents of younger members should be utilized to the fullest extent.
The Chapters and the Section and District Committees provide the
first opportunity for younger members to undertake responsible tasks.
For service at the national level on the Council or Board, the problem
of introducing new blood is more difficult. In the elections, which are
by majority vote, my impression is that the candidate who is more
widely known (and usually older) is very frequently the winner.
Something can be done about this problem both by the Committee on
Elections when they nominate candidates and by the President when he
appoints committees.
Is the ASA able to stimulate new developments in statistics? Some members
have expressed the opinion that in the thirties and early forties
the ASA missed an opportunity by not playing a more prominent part
in the developments which led to the formation of a number of other
societies with statistical interests. I am not sure that I would agree. In
the Biometric Society, which we did help to establish, I have been
slightly disturbed in case the statisticians should play too prominent
a role relative to the biologists. In founding this kind of a society, there
is something to be said for leaving much of the initiative to the scientists
in the subject-matter field, who would not in general be members
of the ASA. Nevertheless, our assumption of the role of a central
organization with very wide interests does carry more responsibility
for helping such developments, rather than leaving them to take place
outside the ASA.
Here again we must rely mainly on the Section Committees, particularly
when they arrange programs, to be on the lookout for new developments.
Inspection of the wide range of our programs in recent
years suggests that the committees have been lively and enterprising in
this respect. The Board and Council and the office can also help. For
a time, the Board felt impelled to adopt a cautious policy owing to our
financial difficulties, but fortunately these appear to be well out of the
way.
Is the ASA able to exercise leadership for statistics as a whole? So far as
the use of statistics in government is concerned, our leadership is
recognized as a result of a long history of disinterested service to agencies of
the government. I believe that international statistical agencies would
also join in this recognition.
How do we stand in other areas involving statistical interests? Are
we active enough in exercising leadership? These questions are more
troublesome. Two areas that have always been of deep concern to the
Council and Board are that involving relations with the public and that
involving statistical standards. A piece of sound and important statistical
work may be subjected to unjustified public attack, or a piece of
shoddy and unscrupulous work, masking as statistically sound, may
threaten to bring discredit on the profession. Should such circumstances
arise, I imagine that most members would expect the Association
to take corrective action. The problem of doing this effectively
raises numerous difficulties. The critical moment for taking action may
not be clear; there may be varying opinions about the most appropriate
type of action; and the pressure of time may prevent thorough study
of the issue before something must be done. For these reasons I am
doubtful whether reliance on any standing body, such as the Council
or some designated committee, will be adequate. The analogy with a
fire brigade is not good, because nobody rings the alarm bell to tell us
when to spring into action. The Council and Board have been struggling
to consider what program of study might be initiated in order to
establish a set of principles and a mode of action for dealing with such
emergencies so that we will not be caught unawares. This is a task
that needs all the help that members can give. For many of the problems
it seems clear that to be fully effective, the ASA must work along
with other societies that have statistical interests. Consequently, a
program of this kind may be one means of drawing us closer to these
societies.


Finally, any account of our present structure must recognize that
we are a voluntary organization. Apart from a tiny office staff, everything
that we do depends on the voluntary labor of the members. The
Association can become what the members want it to be: there is no
entrenched bureaucracy to impose its own pattern. Any member with
a bright idea will receive an interested hearing (although he may sometimes
have to talk a little loudly in order to do so). If his idea is bright
enough, he will very likely find himself asked to carry it out as an
enterprise of the Association. Secondly, we are a scientific as distinct from
a professional society, in the sense that the Association has always
worked for the highest statistical standards rather than for the economic
interests of its members. As we grow larger, it may be harder to
retain this voluntary, scientific character while representing effectively
the whole range of statistical activities. For my part, I hope that we
can do both.



To summarize, the ASA is in a difficult period of growth in trying to
keep up with an extraordinary expansion of statistics which scarcely
anyone could have predicted accurately. In particular, the increasing
specialization within statistics has set up forces which tend to decrease
the amount of common interest amongst members and to split them
into separate groups. The task of serving all areas of application in
this rapidly-changing environment will require us to be wide-awake,
adaptable, and receptive to new ideas and new ventures. My own
appraisal would be that during the past five years our Association has
made gratifying progress, especially in view of the financial stringencies
which inflation imposed upon us. Some of the provisions of the 1948
Constitution have had only modest effects as yet, but these provisions
have not proved harmful: they create mechanisms that will increase
our flexibility in adapting ourselves to the future growth of the field
of statistics. Although much remains to be done, I believe that we
now have an organizational pattern that at least for the near future
will enable us to take full advantage of our broad, common interests
while giving scope also to our more specialized interests.


PRINCIPLES OF SAMPLING*

WILLIAM G. COCHRAN, Johns Hopkins University
FREDERICK MOSTELLER, Harvard University
JOHN W. TUKEY, Princeton University

I. SAMPLES AND THEIR ANALYSES

1. Introduction

WHETHER by biologists, sociologists, engineers, or chemists, sampling
is all too often taken far too lightly. In the early years of
the present century it was not uncommon to measure the claws and
carapaces of 1000 crabs, or to count the number of veins in each of 1000
leaves, and then to attach to the results the “probable error” which
would have been appropriate had the 1000 crabs or the 1000 leaves
been drawn at random from the population of interest. Such actions
were unwarranted shotgun marriages between the quantitatively
unsophisticated idea of sample as “what you get by grabbing a handful”
and the mathematically precise notion of a “simple random sample.”

In the years between we have learned caution by bitter experience.
We insist on some semblance of mechanical (dice, coins, random number
tables, etc.) randomization before we treat a sample from an existent
population as if it were random. We realize that if someone just
“grabs a handful,” the individuals in the handful almost always
resemble one another (on the average) more than do the members of a
simple random sample. Even if the “grabs” are randomly spread around
so that every individual has an equal chance of entering the sample,
there are difficulties. Since the individuals of grab samples resemble
one another more than do individuals of random samples, it follows
(by a simple mathematical argument) that the means of grab samples
resemble one another less than the means of random samples of the
same size. From a grab sample, therefore, we tend to underestimate
the variability in the population, although we should have to
overestimate it in order to obtain valid estimates of variability of grab
sample means by substituting such an estimate into the formula for
the variability of means of simple random samples. Thus using simple
random sample formulas for grab sample means introduces a double
bias, both parts of which lead to an unwarranted appearance of
stability.

* This paper will constitute Appendix G of Cochran, Mosteller, and Tukey, Statistical Problems
of the Kinsey Report, to be published by the American Statistical Association later this year as a
monograph. The main body of this monograph was published in the Journal last December (Vol. 48
(1953), pp. 673-716).

Returning to the crabs, we may suppose that the crabs in which we
are interested are all the individuals of a wide-ranging species
along a few hundred miles of coast. It is obviously impractical to seek
to take a simple random sample from the species: no one knows how
to give each crab in the species an equal chance of being drawn into
the sample (to say nothing of trying to make these chances
independent). But this does not bar us from honestly assessing the likely
range of fluctuation of the result. Much effort has been applied in
recent years, particularly in sampling human populations, to the
development of sampling plans which simultaneously,

(i) are economically feasible 


Any excuse for the dangerous practice of treating non-random samples
as random ones is now entirely tenuous. Wider knowledge of the principles
involved is needed if scientific investigations involving samples
(and what such investigation does not?) are to be solidly based.
Additional knowledge of techniques is not so vitally important, though it
can lead to substantial economic gains.
A botanist who gathered 10 oak leaves from each of 100 oak trees
might feel that he had a fine sample of 1000, and that, if 500 were
infected with a certain species of parasites, he had shown that the
percentage infection was close to 50%. If he had studied the binomial
distribution he might calculate a standard error according to the
formula for random samples, $p \pm \sqrt{pq/n}$, which in this case is
50 ± 1.6% (since p = q = .5 and n = 1000). In this doing he would neglect
three things:
(i) Probable selectivity in selecting trees (favoring large trees,
perhaps?),
(ii) Probable selectivity in choosing leaves from a selected tree
(favoring well-colored or, alternatively, visibly infected leaves, perhaps?)


Most scientists are keenly aware of the analogs of (i) and (ii) in their
own fields of work, at least as soon as they are pointed out to them.
Far fewer seem to realize that, even if the trees were selected at random
from the forest and the leaves were chosen at random from each
selected tree, (iii) must still be considered. But if, as might indeed
be the case, each tree were either wholly infected or wholly free of
infection, then the 1000 leaves tell us no more than 100 leaves, one from
each tree. (Each group of 10 leaves will be all infected or all free of
infection.) In this case we should take n = 100 and find an infection rate
of 50 ± 5%.
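
The two standard errors quoted for the oak leaves can be checked with a short calculation. The sketch below is in Python; the function name `binomial_se` is our own label for illustration and does not come from the paper.

```python
import math

def binomial_se(p, n):
    """Standard error of a proportion under simple random sampling: sqrt(p*q/n)."""
    q = 1.0 - p
    return math.sqrt(p * q / n)

# 1000 leaves treated (wrongly) as 1000 independent random draws
se_leaves = binomial_se(0.5, 1000)

# 100 trees, each wholly infected or wholly free: only 100 effective observations
se_trees = binomial_se(0.5, 100)

print(f"n = 1000: 50 +/- {100 * se_leaves:.1f}%")
print(f"n = 100:  50 +/- {100 * se_trees:.1f}%")
```

Dividing the effective sample size by 10 multiplies the standard error by the square root of 10, which is why 1.6% becomes 5%.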

Such an extreme case of increased fluctuation due to sampling in
clusters would be detected by almost all scientists, and is not a serious
danger. But less extreme cases easily escape detection and may therefore
be very dangerous. This is one example of the reasons why the
principles of sampling need wider understanding.

We have just described an example of cluster sampling, where the
individuals or sampling units are not drawn into the sample
independently, but are drawn in clusters, and have tried to make it clear
that “individually at random” formulas do not apply. It was not our
intention to oppose, by this example, the use of cluster sampling, which
is often desirable, but only to speak for proper analysis of its results.




2. Self-weighting probability samples
There are many ways to draw samples such that each individual
or sampling unit in the population has an equal chance of appearing
in the sample. Given such a sample, and desiring to estimate the population
average of some characteristic, the appropriate procedure is to
calculate the (unweighted) mean of all the individual values of that
characteristic in the sample. Because weights are equal and require no
obvious action, such a sample is self-weighting. Because the relative
chances of different individuals entering the sample are known and
compensated for (are, in this case, equal), it is a probability sample.
(In fact, it would be enough if we knew somewhat less, as explained
in Section 5.)

Such a sample need not be a simple random sample, such as one
would obtain by numbering all the individuals in the population, and
then using a table of random numbers to select the sample on the basis:
one random number, one individual. We illustrate this by giving various
examples, some practical and others impractical.

Consider the sample of oak leaves; it might in principle be drawn
in the following way. First we list all trees in the forest of interest,
recording for each tree its location and the number of leaves it bears.
Then we draw a sample of 100 trees, arranging that the probability
of a tree's being selected is proportional to the number of leaves which
it bears. Then on each selected tree we choose 10 leaves at random.
It is easy to verify that each leaf in the forest has an equal chance of
being selected. (This is a kind of two-stage sampling with probability
proportional to size at the first stage.)
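
The verification can be sketched with exact arithmetic. The code below is an illustration only: the leaf counts are hypothetical, and it considers a single draw of a tree (as if the 100 trees were drawn one at a time with replacement, which is our simplifying assumption). On each draw, the chance of taking one particular leaf is the chance of drawing its tree times the chance of picking that leaf among the 10 taken.

```python
from fractions import Fraction

# Hypothetical leaf counts for a tiny "forest" (illustrative values only)
leaves_per_tree = [120, 300, 45, 980, 15]
total_leaves = sum(leaves_per_tree)
leaves_drawn_per_tree = 10

def leaf_chance_per_draw(tree_leaves):
    """Chance that one particular leaf on this tree is taken on a single tree draw."""
    # First stage: tree drawn with probability proportional to its number of leaves
    p_tree = Fraction(tree_leaves, total_leaves)
    # Second stage: 10 leaves chosen at random from the drawn tree
    p_leaf_given_tree = Fraction(leaves_drawn_per_tree, tree_leaves)
    return p_tree * p_leaf_given_tree

chances = [leaf_chance_per_draw(n) for n in leaves_per_tree]
print(chances)
```

The tree's leaf count cancels, so every leaf has the same chance, 10 divided by the total number of leaves, whichever tree it grows on.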
We must emphasize that such terms as “select at random,” “choose
at random,” and the like, always mean that some mechanical device,
such as coins, cards, dice, or tables of random numbers, is used.


A more practical way to sample the oak leaves might be to list only
the locations of the trees (in some parts of the country this could be
done from a single aerial photograph), and then to draw 100 trees in
such a way that each tree has an equal chance of being selected. The
number of leaves on each tree is now counted and the sample of 1000
is prorated over the 100 trees in proportion to their numbers of leaves.
It is again easy to verify that each leaf has an equal chance of appearing
in the sample. (This is a kind of two-stage sampling with probability
proportional to size at the second stage.)

If the forest is large, and each tree has many leaves, either of these
procedures would probably be impractical. A more practical method
might involve a four-stage process in which:

(a) the forest is divided into small tracts,

(b) each tract is divided into trees,

(c) each tree is divided into recognizable parts, perhaps limbs, and

(d) each part is divided into leaves.

In drawing a sample, we would begin by drawing a number of tracts,
then a number of trees in each tract, then a part or number of parts
from each tree, then a number of leaves from each part. This can be
done in many ways so that each leaf has an equal chance of appearing
in the sample.

A different sort of self-weighting probability sample arises when we
draw a sample of names from the Manhattan telephone directory,
taking, say, every 17,387th name in alphabetic order starting with one
of the first 17,387 names selected at random with equal probability.
It is again easy to verify that every name in the book has an equal
chance of appearing in the sample (this is a systematic sample with a
random start, sometimes referred to as a systematic random sample).
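
A minimal sketch of a systematic sample with a random start follows. The function names, the list size of 100 and the interval of 7 are hypothetical stand-ins for the directory and the interval of 17,387. The second function enumerates every possible start and counts how often each position is chosen, which is one way to carry out the verification.

```python
import random

def systematic_sample(population_size, interval, rng):
    """Positions (0-based) chosen by a random start followed by every interval-th item."""
    start = rng.randrange(interval)  # each of the first `interval` positions is equally likely
    return list(range(start, population_size, interval))

def inclusion_counts(population_size, interval):
    """For every position, count how many of the possible starts would select it."""
    counts = [0] * population_size
    for start in range(interval):
        for position in range(start, population_size, interval):
            counts[position] += 1
    return counts

# Hypothetical small "directory" of 100 names, taking every 7th name
counts = inclusion_counts(population_size=100, interval=7)
sample = systematic_sample(population_size=100, interval=7, rng=random.Random(0))
print(set(counts), len(sample))
```

Each position is selected by exactly one of the equally likely starts, so every name has a chance of 1 in 7 here (1 in 17,387 in the directory example).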

As a final example of this sort, we may consider a national sample
of 480 people divided among the 48 states. We cannot divide the 480
cases among the individual states in proportion to population very
well, since Nevada would then receive about one-half of a case. If we
put the small states into blocks, however, we can arrange for each
state or block of states to be large enough so that on a pro rata basis
it will have at least 10 cases. Then we can draw samples within each
state or block of states in various ways. It is easy to verify that the
chances of any two persons entering such a sample (assuming adequate
randomness within each state or block of states) are approximately
the same, where the approximation arises solely because a whole number
of cases has to be assigned to each state or block of states. (This is
a rudimentary sort of stratified sample.)

All of these examples were (at least approximately) self-weighting
probability samples, and all yield honest estimates of population
characteristics. Each one requires a different formula for assessing the
stability of its results! Even if the population characteristic studied is a
fraction, almost never will

$p \pm \sqrt{pq/n}$

be a proper expression for “estimate ± standard error.” In every case,
a proper formula will require more information from the sample than
merely the overall percentage. (Thus, for instance, in the first oak
leaf example, the variability from tree to tree of the number infested
out of 10 would be needed.)


3. Representativeness
Another principle which ought not to need recalling is this: By sampling
we can learn only about collective properties of populations, not
about properties of individuals. We can study the average height, the
percentage who wear hats, or the variability in weight of college
juniors, or of University of Indiana juniors, or of the juniors belonging
to a certain fraternity or club at a certain institution. The population
we study may be small or large, but there must be a population, and
what we are studying must be a population characteristic. By sampling,
we cannot study individuals as particular entities with unique
idiosyncrasies; we can study regularities (including typical variabilities
as well as typical levels) in a population as exemplified by the
individuals in the sample.
Let us return to the self-weighted national sample of 480. Notice
that about half of the times that such a sample is drawn, there will be
no one in it from Nevada, while almost never will there be anyone from
Esmeralda County in that state. Local pride might argue that “this
proves that the sample was unrepresentative,” but the correct position
seems to be this:
(i) the particular persons in the sample are there by accident, and
this is appropriate, so far as population characteristics are
concerned,
(ii) the sampling plan is representative since each person in the U.S.
had an equal chance of entering the sample, whether he came
from Esmeralda County or Manhattan.
That which can be and should be representative is the sampling plan,
which includes the manner in which the sample was drawn (essentially
a specification of what other samples might have been drawn and what
the relative chances of selection were for any two possible samples) and
how it is to be analyzed.

However great their local pride, the citizens of Esmeralda County,
Nevada, are entitled to representation in a national sampling plan
only as individual members of the U.S. population. They are not
entitled to representation as a group, or as particular individuals, only
as individual members of the U.S. population. The same is true of the
citizens of Nevada, who are represented in only half of the actual
samples. The citizens of Nevada, as a group, are no more and no less
entitled to representation than any other group of equal size in the
U.S., whether geographical, racial, marital, criminal, selected at
random, or selected from those not in a particular national sample.

It is clear that many such groups fail to be represented in any
particular sample, yet this is not a criticism of that sample. Representation
is not, and should not be, by groups. It is, and should be, by
individuals as members of the sampled population. Representation is not,
and should not be, in any particular sample. It is, and should be, in the
sampling plan.

4. One method of assessing stability

Because representativeness is inherent in the sampling plan and not
in the particular sample at hand, we can never make adequate use of
sample results without some measure of how well the results of this
particular sample agree with the results of other samples
which the same sampling plan might have provided. The ability to
assess stability fairly is as important as the ability to represent the
population fairly. Modern sampling plans concentrate on both,


these are usually our most reliable source of information about the
population. There is no reason, however, why assessment should depend
only on the sample size and the overall (weighted) sample mean
for the characteristic considered. These two suffice when measuring
percentages with a simple random sample, but in almost all other
cases the situation is more complex.
It would be too bad if, every time such samples were used, the user
had to consult a complicated table of alternative formulas, one for
each plan, before calculating his standard errors. (These formulas do
need to be considered whenever we are trying to do a really good job
of maximum stability for minimum cost, and considered very carefully in
selecting one complex design in preference to another.) Fortunately,
however, this complication can often be circumvented.

One of the simplest ways is to build up the sample from a number of
independent subsamples, each of which is self-sufficient, though small,
and to tabulate the results of interest separately for each subsample.
Then variation among separate results gives a simple and honest
yardstick for the variability of the result or results obtained by throwing
all the samples together. Such a sampling plan involves interpenetrating
replicate subsamples.

All of us can visualize interpenetrating replicate subsamples when
the individuals or sampling units are drawn individually at random.
Some examples in more complex cases may be helpful. In the first oak
leaf example, we might select randomly, not one sample of 100 trees,
but 10 subsamples of 10 trees each. If we then pick 10 leaves at random
from each tree, placing them in 10 bags, one for each subsample, and
tabulate the results separately, bag by bag, we will have 10
interpenetrating replicate subsamples. Similarly, if we were to pick 10
subsamples out of the Manhattan phone book with each subsample
consisting of every 173,870th name (in alphabetic order) and with the 10
lead names of the 10 subsamples selected at random from the first
173,870 names, we would again have 10 interpenetrating replicate
subsamples.
We can always analyze 10 results from 10 independent interpenetrating
replicate subsamples just as if they were 10 randomly selected
individual measurements and proceed similarly with other numbers of
replicate subsamples.
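
As a hedged illustration of this analysis, the sketch below treats the results from 10 replicate subsamples as 10 individual measurements and computes their mean and its standard error (the sample standard deviation divided by the square root of the number of subsamples). The bag percentages are invented for the example, and the function name is ours.

```python
import math
import statistics

# Hypothetical percentage infected in each of 10 replicate subsamples (bags)
bag_percent_infected = [52.0, 47.0, 55.0, 44.0, 50.0, 58.0, 41.0, 49.0, 53.0, 51.0]

def mean_and_standard_error(results):
    """Treat the replicate results as independent measurements of the same quantity."""
    k = len(results)
    mean = statistics.mean(results)
    # Sample standard deviation (divisor k - 1) of the replicate results
    spread = statistics.stdev(results)
    return mean, spread / math.sqrt(k)

estimate, standard_error = mean_and_standard_error(bag_percent_infected)
print(f"estimate = {estimate:.1f}%, standard error = {standard_error:.2f}%")
```

Nothing in this calculation depends on how each subsample was drawn, provided every subsample was drawn independently by the same plan; that is the point of the device.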

5. General probability samples


The types of sample described in the last section are not the only
kinds from which we can confidently make inferences from the sample
to the population of interest. Besides the trivial cases where the sample
amounts to 90% or even 95% of the population, there is a broad class
of cases, including those of the last section as special cases. This is the
class of probability samples, where:

(1) There is a population, the sampled population, from which the
sample is drawn, and each element of which has some chance of
entering the sample.
(2) For each pair of individuals or sampling units which are in the
actual sample, the relative chances of their entering the sample
are known. (This implies that the sample was selected by a process
involving one or more steps of mechanical randomization.)
(3) In the analysis of the actual sample, these relative chances have
been compensated for by using relative weights such that
(relative chance) times (relative weight) equals a constant.
(4) For any two possible samples, the sum of the reciprocals of the
relative weights of all the individuals in the sample is the same.
(Conditions (3) and (4) can be generalized still further.) In practice of
course, we ask only that these four conditions shall hold with a
sufficiently high degree of approximation.

We have made the sampling plan representative, not by giving each
individual an equal chance to enter the sample and then weighting
them equally, but by a more noticeable process of compensation, where
those individuals very likely to enter the sample are weighted less,
while those unlikely to enter are weighted more when they do appear.
The net result is to give each individual an equal chance of affecting
the (weighted) sample mean.
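
The compensation can be sketched in a few lines. In the illustration below the relative chances and the observed values are hypothetical, and the function names are our own. Relative weights are chosen so that (relative chance) times (relative weight) equals a constant, and the weighted sample mean is then formed.

```python
from fractions import Fraction

def relative_weights(relative_chances, constant=1):
    """Weights chosen so that (relative chance) * (relative weight) == constant."""
    return [Fraction(constant) / Fraction(c) for c in relative_chances]

def weighted_mean(values, weights):
    """Weighted sample mean: sum of weight * value over sum of weights."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)

# Hypothetical sample of four individuals; the last two were
# five times as likely to be drawn as the first two
chances = [1, 1, 5, 5]
values = [10, 20, 30, 40]

weights = relative_weights(chances)
estimate = weighted_mean(values, weights)
print(weights, estimate)
```

The individuals who were five times as likely to enter the sample carry one-fifth of the weight, so their over-representation does not pull the estimate toward their values.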

Such general probability samples are just as honest and legitimate as
the self-weighting probability samples. They often offer substantial
advantages in terms of higher stability for lower cost.

We can alter our previous examples, so as to make them examples of
general, and not of self-weighting, probability samples. Take first the
oak leaf example. We might proceed as follows:
(1) locate all the trees in the forest of interest,
(2) select a sample of trees at random,
(3) for each sampled tree, choose 10 leaves at random and count (or
estimate) the total number of leaves,
(4) form the weighted mean by summing the products
(fraction of 10 leaves infested) times
(number of leaves on the tree)
and then divide by the total number of leaves on the 100 trees
in the sample.
When we selected trees at random, each tree had an equal probability
of selection. When we chose 10 leaves from a tree at random, the
chance of getting a particular leaf was

    10 / (number of leaves on the tree).



Thus the chance of selecting any one leaf was a constant multiple of
this and was proportional to the reciprocal of the number of leaves of
the tree. Hence the correct relative weight is proportional to the number
of leaves on the tree, and it is simplest to take it as 1/10 of that
number. After all, summing the products
    (fraction of 10 infected) times (leaves on tree)
or
    (1/10) times (number out of 10 infected) times (leaves on tree)
over all trees in the sample gives the same answer. One-tenth of this
answer is given by summing
    (1/10) times (number out of 1 infected) times (leaves on tree)
or
    ((leaves on tree) / 10) times (number out of 1 infected)
which shows that the weighted mean prescribed above is just what
would have been obtained with relative weights of (number of leaves
on tree)/10.
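
The equivalence between the per-tree recipe and a per-leaf weighting can be checked numerically. The following is only a sketch: the function names and the three-tree data set are hypothetical, and each tree is represented by the number of infested leaves among its 10 sampled leaves together with its leaf count.

```python
def per_tree_estimate(trees):
    """Sum (fraction of 10 infested) * (leaves on tree), divided by total leaves."""
    numerator = sum((infested / 10) * leaves for infested, leaves in trees)
    total_leaves = sum(leaves for _, leaves in trees)
    return numerator / total_leaves


def per_leaf_estimate(trees):
    """Weighted mean over sampled leaves, each weighted by (leaves on tree) / 10."""
    weighted_sum = 0.0
    weight_total = 0.0
    for infested, leaves in trees:
        weight = leaves / 10
        # One 0/1 indicator per sampled leaf on this tree.
        indicators = [1] * infested + [0] * (10 - infested)
        for indicator in indicators:
            weighted_sum += weight * indicator
            weight_total += weight
    return weighted_sum / weight_total


# Hypothetical data: (infested among the 10 sampled leaves, leaves on tree).
trees = [(2, 5000), (5, 20000), (0, 1000)]
by_tree = per_tree_estimate(trees)
by_leaf = per_leaf_estimate(trees)
```

Both routes give the same value because any constant factor in the relative weights cancels between numerator and denominator.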

If, in sampling the names in the Manhattan telephone directory, we
desired to sample initial letters from P through Z more heavily, we
might proceed as follows:

  (1) Select one of the first 17,387 names at random with equal
      probability as the lead name.
  (2) Take the lead name, and every 17,387th name in alphabetic
      order following it, into the sample.
  (3) Take every name which begins with P, Q, R, S, ..., Z and is
      the 103rd or 207th name after a name selected in step 2 of the
      sample.
Each name beginning with A, B, ..., N, O has a chance of 1/17,387
of entering the sample. Each name beginning with P, Q, ..., Y, Z
has a chance of 3/17,387 of entering the sample (it enters if any one of
three names among the first 17,387 is selected as the lead name). Thus
the relative weight in the sample of a name beginning with A, B, ...,
N, O is 3 times that of a name beginning with P, Q, ..., Y, Z. The
weighted mean is found simply as:

     3(sum for A, B, ..., N, O's) + (sum for P, Q, ..., Y, Z's)
    ----------------------------------------------------------------
     3(A, B, ..., N, O's in sample) + (P, Q, ..., Y, Z's in sample)
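
A short sketch of this weighted mean follows. The function name and the observed values are hypothetical; only the weights 3 and 1 come from the sampling plan described above.

```python
def directory_weighted_mean(a_to_o_values, p_to_z_values):
    """Weighted mean with weight 3 for A-O names and weight 1 for P-Z names."""
    numerator = 3 * sum(a_to_o_values) + sum(p_to_z_values)
    denominator = 3 * len(a_to_o_values) + len(p_to_z_values)
    return numerator / denominator


# Hypothetical measurements on the sampled names.
a_to_o_values = [4, 6]
p_to_z_values = [2, 2, 2, 4, 4, 4]
mean = directory_weighted_mean(a_to_o_values, p_to_z_values)
```

The P-Z names are three times as likely to appear, so each A-O name must count three times as much to keep the two groups in their population proportions.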


Finally we may wish to distribute our national sample of 480 with
10 in each state. The analysis exactly parallels the oak leaf case, and
we have to form the sum of
    (mean for state sample) times (population of state)
and then to divide by the population of the U.S.


6. Nature and properties of general probability samples
We can carry over the use of independent interpenetrating replicates
to the general case without difficulty. We need only remember that the
replicates must be independent. In the oak leaf example, the replicates
must come from groups of independently selected trees. In the Manhattan
telephone book example, the replicates must be based on independently
chosen lead names; in the national sample, the replicates
must have members in every state. In every case they must interpenetrate,
and do this independently.
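
One common way of using such replicates, sketched here under the assumption that each replicate yields its own weighted estimate and that the replicates are of comparable size, is to judge stability from the spread of the replicate estimates. The function name and the four replicate values are hypothetical.

```python
import math
import statistics


def replicate_summary(replicate_estimates):
    """Return the mean of the replicate estimates and its standard error."""
    k = len(replicate_estimates)
    overall = statistics.mean(replicate_estimates)
    # Spread between independent replicates, scaled to the mean of k of them.
    standard_error = statistics.stdev(replicate_estimates) / math.sqrt(k)
    return overall, standard_error


# Hypothetical weighted means from four independent replicates.
overall, standard_error = replicate_summary([0.42, 0.38, 0.45, 0.35])
```

The calculation is only as trustworthy as the independence of the replicates, which is why the text insists on it.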

It is clear from discussion and examples that general probability
samples are inferior to self-weighting probability samples in two ways,
for both simplicity of exposition and ease of analysis are decreased. If
it were not for compensating advantages, general probability samples
would not be used. The main advantages are:


  (1) better quality for less cost due to reduction in administrative
      costs or prelisting cost,
  (2) better quality for less cost because of better allocation of effort
      over strata,
  (3) greater possibility of making estimates for individual strata.
All three of these advantages can be illustrated on our examples. In the
Section 2, there is no need to determine the size (number of leaves)
of all trees. This is a clear cost reduction, whether in money or time.
Suppose that, in the Manhattan telephone book sample, one aim was:


It is perhaps worth mentioning at this point that, if cost is proportional
to the total number of individuals without regard to number of strata
or the distribution of interviews among strata, the optimum allocation
of interviews is proportional to the product
    (size of stratum) times (standard deviation within stratum).
In particular, optimum allocation calls for sample strata not in
proportion to population strata. If we weight appropriately,
disproportionate samples will be better than proportionate ones, if we
choose the disproportions wisely.
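
The allocation rule can be sketched as follows. The function name, the three strata and their standard deviations are hypothetical, and the fractional results would in practice be rounded to whole interviews.

```python
def optimum_allocation(total_interviews, strata):
    """Allocate interviews in proportion to (size of stratum) * (within-stratum SD).

    strata maps a stratum name to a (size, standard deviation) pair.
    """
    products = {name: size * sd for name, (size, sd) in strata.items()}
    total = sum(products.values())
    return {name: total_interviews * p / total for name, p in products.items()}


# Hypothetical strata: a small stratum with a large standard deviation
# receives far more than its proportionate share.
strata = {"A": (6000, 1.0), "B": (3000, 2.0), "C": (1000, 8.0)}
allocation = optimum_allocation(400, strata)
proportionate = {name: 400 * size / 10000 for name, (size, _) in strata.items()}
```

In this illustration stratum C holds a tenth of the population but receives 160 of the 400 interviews, against 40 under proportionate sampling.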
In specifying the characteristics of a probability sample at the
beginning of this paper, we required that there be a sampled population,
a population from which the sample comes and each member of which
has a chance of entering the sample. We have not said whether or not
this is exactly the same population as the population in which we are
interested, the target population. In practice they are rarely the same,
though the difference is frequently small. In human sampling, for
example, some persons cannot be found and others refuse to answer. The
issues involved in this difference between sampled population and
target population are discussed at some length in Part II, and in
chapter III-D of Appendix D in our complete report.


7. Stratification and adjustment
In many cases general probability samples can be thought of in
terms of
  (1) a subdivision of the population into strata,
  (2) a self-weighting probability sample in each stratum, and
  (3) combination of the stratum sample means weighted by the size
      of the stratum.
The general Manhattan telephone book sample can be so regarded.
There are two strata, one made up of names beginning in A, B, ...,
N, O, and the other made up of names beginning in P, Q, ..., Y, Z.
Similarly the general national sample may be thought of as made up
of 48 strata, one for each state.

This manner of looking at general probability samples is neat, often
helpful, and makes the entire legitimacy of unequal weighting clear in
many cases. But it is not general. For in the general oak leaf example,
if there were any strata they would be whole trees or parts of trees.
And not all trees were sampled. (Still every leaf was fairly represented
by its equal chance of affecting the weighted sample mean.) We cannot
treat this case as one of simple stratification.

The stratified picture is helpful, but not basic. It must fail as soon
as there are more potential strata than sample elements, or as soon as
the number of elements entering the sample from a certain stratum is
not a constant of the sampling plan. It usually fails sooner. There is no
substitute for the relative chances that different individuals or sampling
units have of entering the sample. This is the basic thing to consider.
There is another relation of stratification to probability sampling.
When sizes of strata are known, there is a possibility of adjustment.
Consider taking a simple random sample of 100 adults in a tribe where
exactly 50% of the adults were known to be males and 50% females.
Suppose the sample had 60 males and 40 females. If we followed the
pure probability sampling philosophy so far expounded, we should
take the equally weighted sample mean as our estimate of the population
average. Yet if 59 of the 60 men had herded sheep at some time
in their lives, and none of the 40 women, we should be unwise in
estimating that 59% of the tribe had herded sheep at some time in their
lives. The adjusted mean

    0.50 (59/60) + 0.50 (0/40) = 49.2%

is a far better indicator of what we have learned.
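
The arithmetic of this example can be written out as a small sketch. The function name and the data layout are assumptions made for illustration; the figures are those of the tribe example.

```python
def adjusted_mean(strata):
    """Combine stratum sample proportions using known population fractions.

    strata is a list of (population fraction, count with attribute, sample size).
    """
    return sum(fraction * (count / n) for fraction, count, n in strata)


# Males: 59 of 60 had herded sheep; females: 0 of 40.
sample = [(0.50, 59, 60), (0.50, 0, 40)]

# Equally weighted sample mean: 59 of the 100 sampled adults.
unadjusted = sum(count for _, count, _ in sample) / sum(n for _, _, n in sample)
adjusted = adjusted_mean(sample)
```

The unadjusted figure is 59%, the adjusted one about 49.2%; the difference arises entirely from the sample holding 60% males where the tribe holds 50%.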
How can adjustment fail? Under some conditions the variability of
the adjusted mean is enough greater than that of the unadjusted mean
to offset the decrease in bias. It may be a hard choice between
adjustment and nonadjustment.
The last example was extreme, and the unwise choice would be made
by few. But, again, less extreme cases exist, and the unwise choice,
whether it be to adjust or not to adjust, may be made rather easily
(and probably has been made many times). A quantitative rule is
needed. One is given in chapter V-C of the complete report. In the
preceding example the relative sizes of the strata were known exactly.
It turns out that inexact knowledge can be included in the computation
without great increase in complexity.
An example in Kinsey's area is cited by one critic of the Kinsey report:

    These weighted estimates do not, of course, reflect any population changes
    since 1940, which introduces some error into the statistics for the present
    total population. Moreover, on some of the very factors that Kinsey
    demonstrates to be correlated with sexual behavior, there are no Census data
    available. For example, religious membership is shown to be a factor
    affecting sexual behavior, but Census data are lacking and no weights are
    assigned. While the investigators interviewed members of various religious
    groups, there is no assurance that each group is proportionately represented,
    because of the lack of systematic sampling controls. Thus, the proportion of
    Jews in Kinsey's sample would seem to be at least 13 per cent whereas their
    true proportion in the population is of the order of 4 per cent.¹



Do we know the percentage of Jews well enough to make an adjustment
for it? If we can assess the stability of the "4%" figure, the procedure
of Chapter V-C will answer this question. Failing this technique,
we could translate the question into more direct terms as follows:
"In considering Kinsey's results, do we want to have 13 per cent
Jews or 4 per cent Jews in the sampled population?" and try to answer
with the aid of general knowledge and intuition.

We have discussed the adjustment of a simple random sample. The
same considerations apply to the possibility of adjusting any
self-weighting or general probability sample. No new complications arise
when adjustment is superposed on weighting. The presence of a
complication might be suspected in the case where not all segments appear
in the sample, and we attempt to use these segments as strata. Careful
analysis shows the absence of the complication, as may be illustrated
by carrying our example farther.

Suppose that the sheep-herding tribe in question contains a known,
very small percentage of adults of indeterminate sex, and that none
have appeared in our sample. To be sure, their existence affected, albeit
slightly, the chances of males and females entering the sample, but it
does not affect the thinking which urged us to take the adjusted mean.
We still want to adjust, and have only the question "Adjust for
what?" to answer.

If the fraction of indeterminate sex is 0.000002, and the remainder
are half males and half females, and if our anthropological expert feels
that about 1 in 7 of the indeterminate ones has herded sheep, we have a
choice between

    .499999 (59/60) + .499999 (0/40) + .000002 (1/7)

which represents adjustment for three strata, one measured subjectively,
and

    .500000 (59/60) + .500000 (0/40)

which represents adjustment for the two observed strata.
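
Evaluating the two expressions directly shows how little is at stake. The variable names in this sketch are ours; the numbers are those given above.

```python
# Adjustment for three strata, the third judged subjectively at 1 in 7.
three_strata = 0.499999 * (59 / 60) + 0.499999 * (0 / 40) + 0.000002 * (1 / 7)

# Adjustment for the two observed strata only.
two_strata = 0.500000 * (59 / 60) + 0.500000 * (0 / 40)

# The two adjusted means differ by less than one part in a million.
difference = two_strata - three_strata
```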
RES S id 3 А 
Clearly, in this extreme example, the choice is immaterial. Clearly,
also, the estimated accuracy of the anthropologist's judgment must
enter. We can again use the methods of Chapter V-C.

¹ Hyman, H. H. and Sheatsley, P. B. "The Kinsey report and survey methodology," International Journal of Opinion and Attitude Research, Vol. 2 (1948), 184-85.


8. Upper semiprobability sampling

Let us be a little more realistic about our botanist and his sample of
oak leaves. He might have an aerial photograph, and be willing to
select 100 trees at random. But any ladder he takes into the field is
likely to be short, and he may not be willing to trust himself in the
very top of the tree with lineman's climbing irons. So the sample of
10 leaves that he chooses from each selected tree will not be chosen at
random. The lower leaves on the tree are more likely to be chosen than
the highest ones.

In the two-stage process of sampling, the first stage has been a
probability sample, but the second has not (and may even be entirely
unplanned!). These are the characteristic features of an upper
semiprobability sample. As a consequence, the sampled population agrees with
the target population in certain large-scale characteristics, but not in
small-scale ones and, usually, not in other large-scale characteristics.

Thus, if in the oak leaf example we use the weights appropriate to
different sizes of tree, as we should, the sampled population of leaves will
  (1) have the correct relative number of leaves for each tree, but
  (2) will have far too many lower leaves and far too few upper leaves.
The large-scale characteristic of being on a particular tree is a matter
of agreement between sampled and target populations. The large-scale


sample to sampled population will be correct, they may be useless or
misleading because of the great difference between sampled popula-


tween trees, and this is good. But we have done nothing about almost
certain selectivity between leaves on a particular tree; this may be
all right, or very bad. It would be nice to always have probability
samples, and avoid these difficulties. But this may be impractical. (The
conditions under which a nonprobability sample may reasonably
be taken are discussed in Part II.)

There is one point which needs to be stressed. The change from
probability sampling within segments (in the example, within trees) to some
other type of sampling, perhaps even unplanned sampling, shifts a
large and sometimes difficult part of the inference from sample to
target population, shifts it by moving the sampled population away
from the target population toward the sample, shifts it from the
shoulders of the statistician to the shoulders of the subject matter
"expert." Those who use upper semiprobability samples, or other
nonprobability samples, take a heavier load on themselves thereby.

Upper semiprobability samples may be either self-weighting or general.
The "quota samples" of the opinion pollers, where interviewers are
supposed to meet certain quotas by age, sex, and socioeconomic status,
are rather crude forms of upper semiprobability samples, and are often
self-weighting. Bias within segments arises, some contribution being
due, for example, to the different availability of different 42 year old
women of the middle class. The sampled population may contain sexes,
ages, and socioeconomic classes in the right ratios, but retiring persons
are under-represented (and hermits are almost entirely absent) in
comparison with the target population.

Election samples of opinion, although following the same quota
pattern, will ordinarily only be self-weighting within states (if we ignore
the "who will vote" problem). Predictions are desired for individual
states. If Nevada had a mere 100 cases in a self-weighting sample, the
total size of a national sample would have to be about 100,000. When
national percentages are to be compiled, it would be foolish not to
weight each state mean in accordance with the size of the state. No one
would favor, we believe, weighting each state equally just because
there may be (and probably are) biases within each state.
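
The "about 100,000" figure follows from simple proportion. The sketch below assumes round 1950 census populations of roughly 160,000 for Nevada and 151 million for the United States; these populations are an assumption of the sketch and are not quoted in the text.

```python
# Assumed approximate 1950 populations (not taken from the text).
nevada_population = 160_000
us_population = 151_000_000

nevada_cases = 100

# In a self-weighting sample every state is sampled at the same rate,
# so the national size scales with the ratio of the populations.
national_sample_size = nevada_cases * us_population / nevada_population
```

Under these assumed figures the national sample comes to somewhat under 100,000, consistent with the round number in the text.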

Disproportionate samples and unequal weights are just as natural and
wise a part of upper semiprobability sampling as they are of
probability sampling. The difficulties of upper semiprobability sampling do
not lie here; instead they lie in the secret and insidious biases due to
selectivity within segments.

Our sampling of names from the Manhattan telephone directory
might conceivably be drawn by listing the numbers called by subscribers
on a certain exchange during a certain time, and then taking into
the sample names from each exchange in proportion to the names listed
for the exchange. The result would be an upper semiprobability sample
with substantial selectivity within the segments, which here are
exchanges. The nature of this selectivity would depend on the time of
day at which the listing was made.

Whether all segments are represented in an upper semiprobability
sample or not, the segments may be used as strata for adjustment. The
situation is exactly similar to that for probability sampling. The only
difficulty worthy of note is the difficulty of assessing the stability of the
various segment means.

Independent interpenetrating replicate subsamples can be used to
estimate stabilities of over-all or segment means in upper
semiprobability samples without difficulty, if we can obtain a reasonable facsimile
of independence in taking the different subsamples. They provide, if
really independent, respectable bases for inference from sample to
sampled population. We still have a nonprobability sample, however,
and there is no reason for the sampled population to agree with the
target population. The problem is just reduced to "What was the
sampled population?"

What finally is the situation with regard to bias in an upper
semiprobability sample? We shall have a weighted mean or an adjusted one.
In either case, any bias originally contributed by selectivity between
segments will have been substantially removed. But, in either case, the
contribution to bias due to selectivity within segments will remain
unchanged. This is an unknown, and hence additionally dangerous, sort of
bias.
The great danger in weighting or adjusting such samples is not so
much that the weighting or adjusting may make the results worse (as
it will from time to time) but rather that its use may cause the user to
feel that his values are excellent because they are "weighted" or
"adjusted" and hence to neglect possible or likely biases within segments.
Like all other nonprobability sample results, weighted means from
upper semiprobability samples should be presented and interpreted
with caution.


9. Salvage of unplanned samples 


What can we do for such samples? We can either try to improve the
results of their analysis, or try to inquire how good they are anyway.
We may try to improve either actual quality, or our belief in that
quality. The first has to be by way of manner of weighting or
adjustment, the second must involve checking sample characteristics against
population characteristics.


Weighting is impossible, since we cannot construct a sampling plan
and hence cannot estimate chances of entering the sample in any other
manner than by observing the sample itself. So all that we can do
under this head is to adjust. We recall the salient points about
adjustment, which are the same in a complete salvage operation as they are
in any other situation:
  (1) The population is divided into segments.
  (2) Each individual in the sample can be uniquely assigned to a
      segment.
  (3) The population fraction is either known with inappreciable error
      or estimated with known stability.
  (4) The procedures of Chapter V-C of Appendix C of the complete
      report are applied to determine whether, or how much, to adjust.
After adjustment, what is the situation as to bias? Even worse than
with upper semiprobability sampling, because if we do not adjust, we
cannot escape bias by turning to weighting. In summary

  (1) whether adjusted or not, the result contains all the effects of all
      the selectivity exercised within segments, while
  (2) if adjustment is refused by the methods of Chapter V-C, we face
      additional biases resulting from selectivity between segments of
      a magnitude comparable with the difference between unadjusted
      and adjusted mean.
This is, to put it mildly, not a good situation.
Clearly even more caution is needed in presenting and interpreting
the results of a salvage operation on an unplanned sample than for
any of the other types of sample discussed previously. (If it were not
for the psychological danger that adjustment might be regarded as
cure, the caution required for results based on the original,
unadjusted, unplanned sample would, however, be considerably greater.)

Having adjusted or not as seems best, what else can we do? Only
something to make ourselves feel better about the sample. Some other
characteristic than that under study can sometimes be compared in
the adjusted sample and in the population. A large difference is
evidence of substantial bias within segments. Good agreement is
comforting, and strengthens the believability of the adjusted mean for the
characteristic of interest. The amount of this strengthening depends
very much on the a priori relation between the two characteristics.
Some would say that an unplanned sample does not deserve
adjustment, but the discussion in Part II indicates that if any sort of a
summary is to be made, it might as well, in principle, be an adjusted mean.




II. SYSTEMATIC ERRORS


In order to understand how systematic errors in sampling should be
treated, it seems both necessary and desirable to fall back on the
analogy with the treatment of systematic errors in measurement. No
clear account of the situation for sampling seems to be available in the
literature, although understanding of the issues is a prerequisite to the
critical assessment of nonprobability samples. On the other hand, one
of physical science's greatest and more recurrent problems is the
treatment of systematic errors.


10. The presence of systematic errors


Almost any sort of inquiry that is general and not particular involves
both sampling and measurement, whether its aim is to measure the
heat conductivity of copper, the uranium content of a hill, the visual
acuity of high school boys, the social significance of television or the
sexual behavior of the (white) human (U.S.) male. Further, both the
measurement and the sampling will be imperfect in almost every case.
We can define away either imperfection in certain cases. But the
resulting appearance of perfection is usually only an illusion.


We can define the thermal conductivity of a metal as the average
value of the measurements made with a particular sort of apparatus,
calibrated and operated in a specified way. If the average is properly
specified, then there is no "systematic" error of measurement. Yet even
the most operational of physicists would give up this definition when
presented with a new type of apparatus, which standard physical
theory demonstrated to be less susceptible to error.
We can relate the result of a sampling operation to "the result that
would have been obtained if the same persons had applied the same
methods to the whole population." But we want to know about the
population and not about what we would find by certain methods. In
almost all cases, applying the method to the "whole" population would
miss certain persons and units.
Recognizing the inevitability of (systematic) error in both
measurement and sampling, what are we to do? Clearly, attempt to hold the
combined effect of the systematic errors down to a reasonable value.
What is reasonable? This must depend on the cost of further
reduction and the value of accurate results. How do we know that our
systematic errors have been reduced sufficiently? We don't! (And neither
for the observations to provide. The result is not foolproof. We may
learn new things and do better later, but who expects the last words on
any subject?
In 1905, a physicist measuring the thermal conductivity of copper
would have faced, unknowingly, a very small systematic error due to
the heating of his equipment and sample by the absorption of cosmic
rays, then unknown to physics. In early 1946, an opinion poller,
studying Japanese opinion as to who won the war, would have faced a very
small systematic error due to the neglect of the 17 Japanese holdouts,
who were discovered later north of Saipan. These cases are entirely
parallel. Social, biological and physical scientists all need to remember
that they have the same problems, the main difference being the
decimal place in which they appear.


If we admit the presence of systematic errors in essentially every
case, what then distinguishes good inquiry from bad? Some reasonable
criteria would seem to be:
  (1)  Reduction of exposure to systematic errors from either
       measurement or sampling to a level of unimportance, if possible
       and economically feasible, otherwise
  (1+) Balancing the assignment of available resources to reduction
       in systematic or variable errors in either measurement or
       sampling reasonably well, in order to obtain a reasonable
       amount of information for the "money."
  (2)  Careful consideration of possible sources of error and careful
       examination of the numerical results.
  (3)  Presentation of results and inferences in a manner which
       adequately points out both observed variability and conjectured
       exposure to systematic error.

In many situations it is easy, and relatively inexpensive, to reduce
the systematic errors in sampling to practical unimportance. This is
done by using a probability sampling plan, where the chance that any
individual or other primary unit shall enter the sample is known, and
allowed for, and where adequate randomness is ensured by some
scheme of (mechanical) randomization. The systematic errors of such
a sample are minimal, and frequently consist of such items as:
  (a) failure of individuals or primary units to appear on the "list"
      from which selection has been made,
  (b) persons perennially "not at home" or samples "lost,"
  (c) refusals to answer or breakdowns in the measuring device.
These are the hard core of causes of systematic error in sampling.
Fortunately, in many situations their effect is small; there a
probability sample will remove almost all the systematic error due to
sampling.

11. Should a probability sample be taken?



But this does not mean that it is always good policy to take
probability samples. The inquirer may not be able to "afford" the cost in
time or money for a probability sample. The opinion pollers do not
usually afford a probability sample (instead of designating individuals
to be interviewed by a random, mechanical process, they allow their
interviewers to select respondents to fill "quotas") and many have
criticized them for this. Yet the behavior of the few probability
samples in the 1948 election (see pp. 110-112 of The Pre-election
Polls of 1948, Social Science Research Council Report No. 60) does
not make it clear that the opinion pollers should spend their limited
resources on probability samples for best results. (Shifts toward a
probability sample have been promised, and seem likely to be wise.)

The statement "he didn't use a probability sample" is thus not a
criticism which should end further discussion and doom the inquiry to
the cellar. It is always necessary to ask two questions:
  (a) Could the inquirer afford a probability sample?
  (b) Is the exposure to systematic error from a non-probability
      sample small enough to be borne?

If the answer is "no" to both, then the inquiry should not be, or
have been, made, just as would be the case with a physical inquiry
if the systematic errors of all the forms of measurement which the
physicist could afford were unbearably large.

If the answer is "yes" to the first question and "no" to the second,
then the failure to use a probability sample is very serious, indeed.

If the answer is "yes" to both, then careful consideration of the
economic balance is required; however, it should be incumbent on the
inquirer using a nonprobability sample to show why it gave more
information per dollar or per year. (As statisticians, we feel that the
onus is on the user of the nonprobability sample. Offhand we know of
no expert group who would wish to lift it from his shoulders.)

If the answer is "no" to the first question, and "yes" to the second,
then the appropriate reaction would seem to be "lucky man."
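
The four cases can be laid out as a small decision function. This is only a restatement of the text in code; the function name, the argument names and the returned wording are ours.

```python
def assess_nonprobability_sample(could_afford, error_bearable):
    """Map the answers to questions (a) and (b) onto the four verdicts.

    could_afford:   answer to (a), could a probability sample be afforded?
    error_bearable: answer to (b), is the systematic error small enough?
    """
    if not could_afford and not error_bearable:
        return "the inquiry should not be, or have been, made"
    if could_afford and not error_bearable:
        return "failure to use a probability sample is very serious"
    if could_afford and error_bearable:
        return "weigh the economic balance; the onus is on the inquirer"
    return "lucky man"


verdicts = {
    (a, b): assess_nonprobability_sample(a, b)
    for a in (True, False)
    for b in (True, False)
}
```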

Having admitted that the sampling, as well as the measurement,
will have some systematic errors, how then do we do our best to make
good inferences about the subject of inquiry? Sampling and measurement
being on the same footing, we have only to copy, for the sampling
area, the procedure which is well established and relatively well
understood for measurement. This procedure runs about as follows:

We admit the existence of systematic error, of a difference between
the quantity measured (the measured quantity) and the quantity of
interest (the target quantity). We ask the observations about the
measured quantity. We ask our subject matter knowledge, intuition, and
general information about the relation between the measured quantity
and the target quantity.

We can repeat this nearly verbatim for sampling:

We admit the existence of systematic error, of a difference between
the population sampled (the sampled population) and the population
of interest (the target population). We ask the observations about the
sampled population. We ask our subject matter knowledge, intuition,
and general information about the relation between sampled population
and target population.

Notice that the measured quantity is not the raw readings, which
usually define a different measured quantity, but rather the adjusted
values resulting from all the standard corrections appropriate to the
method of measurement. (Not the actual gas volume, but the gas
volume at standard conditions!) Similarly, the result for the sampled
population is not the raw mean of the observations, which usually
defines a different sampled population, but rather the adjusted or
weighted mean, all corrections, weightings and the like appropriate to
the method of sampling having been applied. Weighting a sample
appropriately is no more fudging the data than is correcting a gas volume
for barometric pressure.

The third great virtue of probability sampling is the relative definiteness of the sampled population. It is usually possible to point the finger at most of the groups in the target population who have no chance to enter the sample, who therefore were not in the sampled population; and to point the finger at many of the groups whose chance of entering the sample was less than or more than the chance allotted to them in the computation, who therefore were fractionally or multiply represented in the sampled population. When a nonprobability sample is adjusted and weighted to the best of an expert’s ability, on the other hand, it may still be very difficult to say what the sampled population really is. (Selectivity within segments cannot be allowed for by weights or adjustments, but it arises to some extent in every nonprobability sample and alters the sampled population.)
12. The value and conditions of adjustment
Some would say that correcting, adjusting and weighting most nonprobability samples is a waste of time, since you do not know, when this process has been completed, to what sampled population the


34 AMERICAN STATISTICAL ASSOCIATION JOURNAL, MARCH 1954


adjusted result refers. This is entirely equivalent to saying that it does not pay to adjust the result of a physical measurement for a known systematic error because there are, undoubtedly, other systematic errors and some of them are likely to be in the other direction. Let us inquire into good practice in the measurement situation, and see what guidance it gives us for the sampling situation.

When will the physicist adjust in principle for the known systematic error? When (i) he has the necessary information and (ii) the adjustment is likely to help. The necessary information includes a theory or empirical formula, and the necessary observations. Empirical formulas and observations are subject to fluctuations, so that adjustment will usually change the magnitude of fluctuations as well as altering the systematic error. The adjustment is likely to help unless the supposed reduction of systematic error coincides with a substantial increase in fluctuations.
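One way to make this trade-off concrete is to compare mean squared errors, that is, squared systematic error plus the variance of the fluctuations. The text does not name this criterion, so its use here is an assumption, and the numbers are hypothetical. A minimal sketch in Python:

```python
def mean_squared_error(systematic_error, fluctuation_sd):
    """Squared systematic error plus the variance of the fluctuations."""
    return systematic_error ** 2 + fluctuation_sd ** 2


def adjustment_helps(raw_bias, raw_sd, adjusted_bias, adjusted_sd):
    """True when the adjusted value has the smaller mean squared error."""
    adjusted = mean_squared_error(adjusted_bias, adjusted_sd)
    raw = mean_squared_error(raw_bias, raw_sd)
    return adjusted < raw


# Hypothetical case 1: the bias falls sharply and the noise grows a little.
print(adjustment_helps(raw_bias=2.0, raw_sd=1.0, adjusted_bias=0.5, adjusted_sd=1.5))

# Hypothetical case 2: a small gain in bias is swamped by added fluctuation.
print(adjustment_helps(raw_bias=2.0, raw_sd=1.0, adjusted_bias=1.5, adjusted_sd=3.0))
```

In the first case the adjustment is worth making; in the second the supposed reduction of systematic error is bought with too large an increase in fluctuations.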

If the known systematic error is so small as not to

(1) affect the result by a meaningful amount, or

(2) affect the result by an amount likely to be as large as, or a substantial fraction of, the unknown systematic errors,

then the physicist will report either the adjusted or the unadjusted value. If he reports the unadjusted value, he should state that the adjustment has been examined, and is less than such-and-so. To do this, either he must have calculated the adjustment or he must have had generally applicable and strong evidence that it is small.

In any event, his main care, which he will not always take, must be to warn the reader about the dangers of further systematic errors, perhaps, in some cases, even by saying bluntly that “the adjusted value isn’t much better than the raw value,” and then to provide raw values for those who wish to adjust their own.

If the physicist is aware of systematic errors of serious magnitude and has no basis for adjustment, his practice is to name the measured quantity something, like Brinell hardness, Charpy impact strength, or, if he is a chemist, iodine value, heavy metals as Pb, etc. By analogy, those who feel that the combination of recall and interview technique makes Kinsey’s results subject to great systematic error might well define “KPM sexual behavior” as a standard term,³ and work with this.

By analogy then, when should a nonprobability sample be adjusted in principle? (Most probability samples are made to be weighted anyway; this is part of the design and must be carried out.) When (i) we have the necessary information and (ii) when the adjustment is likely to help. The necessary information will usually consist of facts or estimates of the true fractions in the population of the various segments.

³ The letters KPM stand for Kinsey, Pomeroy and Martin, the authors of Sexual Behavior in the Human Male.

When is the adjustment likely to help? This problem has usually been a ticklish point requiring technical knowledge and intuition. A quantitative solution is now given in Chapter V-C of Appendix C in the complete report. With this as a guide, it should be possible to make reasonable decisions about the helpfulness of adjustment.

If the decision is to adjust, we should accept the sampled population corresponding to the adjusted mean, and calculate the adjustment. We then report the adjusted values unless the adjustment is small, when we may report the unadjusted value with the statement that the adjustment alters it by less than such-and-so.

Our main care, which we may not always take, must be to warn the reader about the dangers of further lack of representativeness, perhaps, in some cases, even by saying bluntly that “the adjusted mean isn’t much better than the raw mean, even if we took 20 pages to tell you how we did it and six months to do it,” and to provide raw means for those who wish to adjust their own.

If we were prepared to report an unadjusted mean, we were clearly inviting inference to some sampled population. Adjustment will give us a sampled population that is usually nearer to the target population. Hence we should adjust.

If we cannot adjust, and must print raw data which we feel badly needs adjustment, we may say that this is what we found in these cases; take ’em or leave ’em. Except from the point of view of protecting the reader from over-belief in the results, this would seem to be a counsel of despair. By analogy with the physicist, it seems better to introduce “KPM sexual behavior” and its analogs in such situations.






DO PERSONS LOST TO LONG TERM OBSERVATION
HAVE THE SAME EXPERIENCE AS
PERSONS OBSERVED?

EVALUATION OF ANTISYPHILITIC THERAPY*

THEODORE J. BAUER, JAMES F. DONOHUE, VINCENT LARSEN,
ALBERT P. ISKRANT, AND QUENTIN R. REMEIN†


Medical research has long been faced with the problem of patients lapsing from observation during the evaluation of treatment. To date methods of analysis of results of therapy have assumed that the patients who lapse from observation would have had the same experience as those who remained under observation. To the extent that this assumption is not correct the calculation of the results of therapy is in error. It has been claimed that the “observed” is weighted in favor of failures as relapses would return for more treatment while those who are getting along all right would not bother to return for posttreatment observation. On the other hand it has been stated that the failures go elsewhere for retreatment instead of to the source of the original treatment and thus bias a study in favor of “good” results.
In an effort to test the hypothesis underlying current evaluation of treatment (viz., the “not observed” would have had the same experience as the observed) a special study in the evaluation of therapy for syphilis, the Blue Star Research Study, was initiated by the Division of Venereal Disease, U. S. Public Health Service, in which an attempt was made to hold a group of patients to 100 per cent follow-up and compare the results of therapy with those obtained when no intensive follow-up effort was made [1]. These patients included 560 persons with secondary syphilis confirmed by darkfield examination who had had no previous antisyphilitic therapy of any kind. The follow-up effort was over 90 per cent effective over a period of two years, largely due to the work of the physicians in the cooperating treatment facilities and specially trained research investigators whose sole functions


* Presented before the session on Statistical Evaluation of Clinical Data, American Statistical Association Annual Meeting, December 28, 1951.

† Dr. Theodore J. Bauer, formerly Chief, Division of Venereal Disease, U.S. Public Health Service, represents the physicians and nurses who, over the years, devoted their diligent efforts to the treatment and observation of Blue Star Research Study patients. Mr. Donohue, Principal Statistician, and Mr. Larsen, Health Program Representative, represent the research investigators whose tireless and careful work in selecting and holding patients to observation ...






have been to assist the physicians in selecting patients and to see that these patients are held to follow-up observation.

The determination of response to therapy in syphilis requires observation of the patient for a long period of time following treatment. In the primary or secondary stage, relapse following treatment generally occurs within two years, if at all, although it may take as long as twenty years or more to determine whether late, disabling effects occur (such as paresis, tabes dorsalis, and aneurysm). This study is limited to a two-year evaluation of response to therapy for secondary syphilis. The presence of relapse is determined by physical examination, darkfield microscopy of sera from lesions, and blood and spinal fluid serologic tests. Results of the blood test are usually reported quantitatively in titer units based on successive twofold dilutions of the patient’s serum. When treatment is successful, the titer gradually declines until negativity is reached a number of months after treatment. After the initial schedule of treatment is completed, no further therapy is administered unless lesions reappear, unless a serologic relapse evidenced by a sustained rise in serologic titer occurs, or a high titer (32 Kahn units or more) is sustained for one year or more following therapy [1]. Death from syphilis rarely occurs in the first few years after onset.

The criteria (other than diagnostic) taken into consideration in the selection of patients for the Blue Star Research Study are willingness of the patient to cooperate in long term follow-up, good general health, residential stability, and, in a measure, personal stability (i.e., alcoholics, drug addicts, etc. were excluded if known). It was felt that these characteristics would not affect the relapse rate to an appreciable extent. In spite of screening, however, some such unstable persons were included in the study because their characteristics were not known at the time of selection.


In the following pages the experience of these intensively followed patients is used in two different methods to test the validity of the hypothesis that patients who lapse from observation would have had the same experience as those who remained under observation: first, by analysis of the results for patients within the special study and second, by comparison of results of special study patients with results for another group of patients who were followed less intensively.


METHOD 1

This analysis is limited to patients treated for secondary syphilis who were selected for the special intensive follow-up study. Three schedules of treatment, all utilizing crystalline penicillin G, are included:

1. A total of 2,800,000 units of aqueous penicillin alone: 25,000 units administered every three hours for fourteen days.

2. A total of 2,800,000 units of aqueous penicillin with conjunctive arsenoxide and bismuth: 25,000 units of penicillin administered every three hours for 14 days and a total of 4 to 6 mg. of arsenoxide per kilogram of body weight (or a total of 300 mg. in persons weighing over 60 kg.) and 600 mg. of bismuth.

3. A total of 3,400,000 units of aqueous penicillin: 40,000 units administered every two hours for seven days.

Almost all of the patients treated on these schedules have had the opportunity to be observed for at least two years, and sufficient cases for evaluation have been accumulated. Patients were observed monthly for the first year and quarterly during the second year.

Follow-up of the patients was secured by the research investigator stationed at each cooperating facility. A variety of follow-up techniques was used by each investigator, including letters, telegrams, telephone calls, visits to patients, etc. The cooperation given by public clinics and physicians in many parts of the country made follow-up continuity possible for many patients who moved beyond commuting distance from the treating facility. Not all patients proved to be cooperative. These cases would have been lost to follow-up had it not been for the diligence of the investigators. The success of follow-up is indicated in Table 1 where it can be seen that approximately 95 per cent of the patients under study were observed for two years or until retreatment, or are still under observation at this writing (about 4 years since the study began). No significant differences among the three schedules in the percentage of patients under observation were noted. (The probability of getting differences giving a chi-square greater than that obtained through chance alone is slightly more than 6 out of 10.)

TABLE 1

SPECIAL STUDY IN TREATMENT OF SECONDARY SYPHILIS
POSTTREATMENT OBSERVATION SUCCESS, BY TREATMENT SCHEDULE

                                        2,800,000 units    2,800,000 units    3,400,000 units
                                       Penicillin Alone  Pen., Ars. & Bis.   Penicillin Alone              Total
                                       Number  Per Cent   Number  Per Cent   Number  Per Cent   Number  Per Cent
Under Observation                         160      95.8      179      92.7      147      95.5      486      94.6
  Retreated before 2 years                 30                 29                 16                 75
  2 years or more since treatment         121                149                124                394
  Less than 2 years since treatment         9                  1                  7                 17
Lost from Observation                       7       4.2       14       7.3        7       4.5       28       5.4
  Lapsed                                    6                 13                  7                 26
  Died (not assoc. with syphilis)           1                  1                  0                  2
Total number treated                      167     100.0      193     100.0      154     100.0      514     100.0
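The chi-square comparison mentioned above can be sketched as follows. The source does not say exactly which counts were tested, so the sketch assumes a 2 by 3 table of patients under observation versus patients lost from observation for the three schedules, with the lost counts taken as total treated minus under observation. It uses only the standard library; for two degrees of freedom the upper-tail probability of chi-square has the closed form exp(-x/2).

```python
import math


def chi_square(table):
    """Pearson chi-square statistic for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Rows: under observation, lost from observation (assumed layout).
# Columns: the three treatment schedules.
counts = [[160, 179, 147],
          [7, 14, 7]]

stat = chi_square(counts)
upper_tail = math.exp(-stat / 2.0)  # exact for 2 degrees of freedom
lower_tail = 1.0 - upper_tail

print(round(stat, 2), round(upper_tail, 2), round(lower_tail, 2))
```

The sketch prints the statistic together with both tail probabilities so that they can be set beside the figure quoted in the text.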

In consultation with research investigators using detailed records on interviews with patients and on the amount of effort required to keep patients under observation, the patients have been classified according to whether or not they would probably have been lost to routine methods of follow-up. Patients who missed two or more appointments without a good reason; those who frequently required special attention in obtaining observations, such as home visits or provision for transportation to the facility, even when no observations were actually missed; and those who moved out of the area served by the facility: all these were considered as becoming lost to routine follow-up during the two year observation period. For each patient who would have been lost to routine follow-up, the posttreatment observation period in which he would have lapsed was estimated as closely as possible.

It was our original intention to divide the research cases into two groups, the cooperative and the uncooperative, and compare the cumulative retreatment rates in both groups. Unfortunately there was no way of determining which patients would be cooperative and which uncooperative except by observing the patient’s behavior. Obviously the longer the time over which the patient was observed the greater the opportunity for showing uncooperativeness. Therefore, those patients who were retreated in the early posttreatment months did not have a chance to become uncooperative and hence in the early months the presumably cooperative patients showed higher retreatment rates.

We therefore decided to make our comparison on a more realistic basis by comparing the results of all patients in the study with the results that would have been obtained with the same cases if they had not had this concentrated follow-up. This would in essence amount to comparing the retreatment rate of patients without intensive follow-up to the retreatment rate of the same patients with intensive follow-up and would resemble the findings in actual practice.

The method of calculation of the retreatment rates is that used by the Division of Venereal Disease and previously described by several of us [3, 4]. Briefly, in this method the retreatment rate is calculated by making appropriate adjustment for the loss of patients from observation. The total used for computing retreatment, seropositivity, and seronegativity rates is adjusted by including the same proportion of retreated as non-retreated patients remaining under observation. Then if $a_n$ persons are observed in period $n$ or later of whom $c_n$ persons are retreated in period $n$, denoting the adjusted total cases by $e_n$,

$$e_n = \frac{a_n\,(e_{n-1})}{(a_{n-1} - c_{n-1})}$$

The retreatment rate in period $n$ is $100 \times c_n / e_n$, and the cumulative retreatment rate is the sum of the retreatment rates for all individual observation periods through $n$. The per cent seropositive or seronegative in period $n$ is simply the number seropositive or seronegative divided by the adjusted total cases times 100.

The retreatment rate obtained by this method is the same as that obtained by the method commonly referred to as “the life table method.” The seronegativity and seropositivity rates differ somewhat since “the life table method” cumulates sustained seronegativity rates whereas the method described here computes the seronegativity rate for each particular period based only on cases observed in that period or later.
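The adjustment described above can be sketched in a few lines of Python. The sketch assumes that the adjusted total follows the recursion e_n = a_n e_(n-1) / (a_(n-1) - c_(n-1)), started from e_1 = a_1, and it checks numerically that summing 100 c_n / e_n gives the same cumulative rate as the usual life-table product. The function names and the counts are hypothetical and chosen only for illustration.

```python
def adjusted_totals(observed, retreated):
    """Adjusted totals e_n, assuming e_1 = a_1 and
    e_n = a_n * e_(n-1) / (a_(n-1) - c_(n-1))."""
    totals = [float(observed[0])]
    for n in range(1, len(observed)):
        not_yet_retreated = observed[n - 1] - retreated[n - 1]
        totals.append(observed[n] * totals[-1] / not_yet_retreated)
    return totals


def cumulative_rates(observed, retreated):
    """Running sum of the period retreatment rates 100 * c_n / e_n."""
    running, out = 0.0, []
    for c_n, e_n in zip(retreated, adjusted_totals(observed, retreated)):
        running += 100.0 * c_n / e_n
        out.append(running)
    return out


def life_table_rates(observed, retreated):
    """Cumulative per cent retreated as 100 * (1 - prod(1 - c_n / a_n))."""
    not_retreated, out = 1.0, []
    for a_n, c_n in zip(observed, retreated):
        not_retreated *= 1.0 - c_n / a_n
        out.append(100.0 * (1.0 - not_retreated))
    return out


# Hypothetical counts: a[n] observed in period n or later, c[n] retreated in n.
a = [167, 167, 164, 164, 161]
c = [0, 2, 0, 3, 1]

rows = zip(adjusted_totals(a, c), cumulative_rates(a, c), life_table_rates(a, c))
for e_n, rate, check in rows:
    print(round(e_n), round(rate, 1), round(check, 1))
```

Each printed row shows the rounded adjusted total followed by the cumulative rate computed both ways; the two rates agree because each period rate c_n / e_n equals the life-table chance of reaching period n without retreatment multiplied by c_n / a_n.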

The following two groups were tabulated and analyzed separately:

A. All patients in the study (i.e., intensive follow-up): analysis of the results of therapy including all posttreatment observations through two years of follow-up on patients in the study.

B. Same patients with “routine” follow-up: analysis of the results of therapy for the same patients including only those observations on the uncooperative patients prior to the posttreatment period in which they would most likely have lapsed.

A couple of examples should help make clear precisely which observations were included in group A and which in group B. Patient L. M. was treated for secondary syphilis and observed for six months. At the end of this time he moved to another city, 500 miles away from the treatment center. Arrangements were made for the patient to be observed by the local clinic in the new city of residence, and follow-up continued for the complete two-year period. In group A the results of all observations on this patient are included for the two years. In group B observations on this patient are included for only six months; that is, until he moved.

Or take another case, E. G., who was treated for secondary syphilis and was observed for 9 months following treatment. He did not appear for his tenth month examination. When the investigator found the patient and interviewed him, the patient indicated that he believed himself to be cured and had decided to stop coming in for examinations. The patient did, however, agree to accompany the investigator to the clinic for an examination and serologic test. For the remainder of the two years, the investigator had to find the patient and bring him to the clinic for each examination. In group A all observations of this patient are included for the two years. In group B only the observations for nine months are included.

Tables 2, 3, and 4 and Figure 1 show by treatment schedule the comparative data on the results of therapy for the same patients for all observations and for those which would ordinarily be made without intensive follow-up. The greatest difference between the cumulative retreatment rates for both groups amounted to only 1.6 per cent at the seventh month on the 3,400,000 units of penicillin schedule. At no point are there significant differences¹ in rates between the two follow-up conditions (at the level of P=.95). At 21-24 months one schedule shows no difference in retreatment rates, and the others show differences in opposite directions. The largest difference was 1.1% in the 3,400,000 unit schedule. The probability of getting larger differences than this by chance alone is nearly one half.


¹ In testing for significance the intensively followed group has been considered the population consisting of $N$ treated patients. In the $i$th observation period since treatment, $s_i$ patients were retreated, representing $p_i$ proportion retreated. In the sample obtained through routine follow-up, primes are used to denote the estimates. To determine whether the cumulative proportion retreated over $r$ observation periods in the sample differs significantly from the cumulative of the parameter over these same intervals, the following formulas are applied:

$$\sigma_{\Sigma p'} = \sqrt{\frac{\sum_{i=1}^{r} p_i \left(1 - \sum_{i=1}^{r} p_i\right)(N - E)}{E\,(N - 1)}}$$

where $E$, the effective sample size as adapted from Cornfield [2], is defined as the equivalent sample size in which there are no losses from observation but which yields the same proportion retreated and the same number of retreated cases as were observed; i.e.,

$$E = \frac{\sum_{i=1}^{r} s_i'}{\sum_{i=1}^{r} p_i'}$$

The t-test is then applied as follows:

$$t = \frac{\sum_{i=1}^{r} p_i - \sum_{i=1}^{r} p_i'}{\sigma_{\Sigma p'}}$$
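The test described in this footnote can be sketched in Python as follows. The sketch assumes that the standard error takes the usual finite-population form for a proportion, P(1 - P)/E multiplied by (N - E)/(N - 1), with the effective sample size E equal to the number retreated in the sample divided by the cumulative proportion retreated in the sample. The function names are hypothetical, and the input figures are illustrative values of roughly the size reported for the 3,400,000 unit schedule, not a reproduction of the authors' computation.

```python
import math


def effective_sample_size(retreated_in_sample, cumulative_rate_in_sample):
    """Sample size that would give the same retreated count and rate
    if there had been no losses from observation."""
    return retreated_in_sample / cumulative_rate_in_sample


def t_value(population_rate, sample_rate, retreated_in_sample, population_size):
    """Difference in cumulative rates divided by its assumed standard error."""
    e = effective_sample_size(retreated_in_sample, sample_rate)
    variance = (population_rate * (1.0 - population_rate) / e
                * (population_size - e) / (population_size - 1))
    return (population_rate - sample_rate) / math.sqrt(variance)


# Illustrative inputs (assumed): N = 154 treated, cumulative rates of
# 10.7 and 9.6 per cent, and 11 retreated cases under routine follow-up.
print(round(effective_sample_size(11, 0.096), 1))
print(round(t_value(0.107, 0.096, 11, 154), 2))
```

A value of t well below 1, as here, corresponds to a difference that could easily arise by chance.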


TABLE 2

COMPARISON OF RESULTS OF THERAPY EVALUATION FOR
INTENSIVE AND “ROUTINE” FOLLOW-UP

SECONDARY SYPHILIS: CUMULATIVE RETREATMENT RATES

TREATMENT SCHEDULE: Penicillin, 2,800,000 units, 25,000 every 3 hours, “Crystalline G” (14 days). No arsenoxide, no bismuth.

A. Intensive Follow-up

Observation     Retreated cases                 Not re-treated            Adjusted
period                                  Seropositive      Seronegative       total
(months)     Number Per Cent   Cum. %   Number Per Cent   Number Per Cent   cases*
0-1               -        -        -      166     99.4        1      0.6      167
1-2               2      1.2      1.2      144     86.2       21     12.6      167
2-3               -        -      1.2      118     71.1       46     27.7      166
3-4               3      1.8      3.0       87     52.4       74     44.6      166
4-5               1      0.6      3.6       64     38.6       96     57.8      166
5-6               5      3.0      6.6       50     30.1      105     63.3      166
6-7               3      1.8      8.4       40     24.3      111     67.3      165
7-8               1      0.6      9.0       34     20.6      116     70.3      165
8-9               2      1.2     10.2       30     18.2      118     71.6      165
9-10              2      1.2     11.4       24     14.8      120     73.8      163
10-11             3      1.8     13.2       21     12.9      120     73.8      163
11-12             1      0.6     13.8       18     11.1      122     75.0      163
12-15             3      1.8     15.6       13      8.0      124     76.2      163
15-18             2      1.2     16.8        8      5.0      126     78.0      162
18-21             2      1.3     18.1        9      5.7      119     76.0      157
21-24             -        -     18.1        8      5.4      113     76.3      148

B. Routine Follow-up

Observation     Retreated cases                 Not re-treated            Adjusted
period                                  Seropositive      Seronegative       total
(months)     Number Per Cent   Cum. %   Number Per Cent   Number Per Cent   cases*
0-1               -        -        -      164     99.4        1      0.6      165
1-2               2      1.3      1.3      135     85.4       21     13.3      158
2-3               -        -      1.3      109     70.3       44     28.4      155
3-4               3      2.0      3.3       79     53.1       65     43.7      149
4-5               1      0.7      4.0       59     40.2       82     55.9      147
5-6               5      3.5      7.5       43     30.1       89     62.4      143
6-7               2      1.4      8.9       35     24.9       93     66.2      140
7-8               1      0.7      9.6       29     20.8       97     69.6      139
8-9               2      1.4     11.0       26     18.8       97     70.1      138
9-10              2      1.5     12.5       20     14.8       98     72.6      135
10-11             2      1.5     14.0       17     13.0       95     72.9      130
11-12             -        -     14.0       15     11.6       96     74.3      129
12-15             2      1.7     15.7       10      8.3       91     75.9      120
15-18             2      1.7     17.4        6      5.2       90     77.4      116
18-21             1      0.9     18.3        7      6.2       85     75.4      113
21-24             -        -     18.3        6      5.6       81     76.0      107

* Adjusted total cases in each period includes cases observed in this period or later and cases retreated in previous periods adjusted for losses from observation.


TABLE 3

COMPARISON OF RESULTS OF THERAPY EVALUATION FOR
INTENSIVE AND “ROUTINE” FOLLOW-UP

SECONDARY SYPHILIS: CUMULATIVE RETREATMENT RATES

TREATMENT SCHEDULE: Penicillin, 2,800,000 units, 25,000 every 3 hours, “Crystalline G”; 4-6 mg. arsenoxide per kg. of body weight (or total of 300 mg. in persons weighing over 60 kg.) and 600 mg. of bismuth (14 days).

A. Intensive Follow-up

Observation     Retreated cases                 Not re-treated            Adjusted
period                                  Seropositive      Seronegative       total
(months)     Number Per Cent   Cum. %   Number Per Cent   Number Per Cent   cases*
0-1               -        -        -      192     99.5        1      0.5      193
1-2               -        -        -      [?]     92.7       14      7.3      [?]
2-3               1      0.5      0.5      140     73.7       49     25.8      [?]
3-4               2      1.1      1.6      108     56.8       79     41.6      [?]
4-5               2      1.1      2.7       79      [?]      105     55.6      [?]
5-6               3      1.6      4.3       61     32.3      120     63.5      [?]
6-7               4      2.1      6.4       48     25.5      128     68.1      [?]
7-8               -        -      6.4       42     22.3      134     71.3      [?]
8-9               2      1.1      7.5       36     19.2      138     73.4      [?]
9-10              1      0.5      8.0       35     18.8      136     73.2      [?]
10-11             1      0.5      8.5       30     16.2      139     75.3      [?]
11-12             3      1.6     10.1       26     14.1      140     75.8      [?]
12-15             4      2.2     12.3       20     10.9      141     76.8      [?]
15-18             2      1.1     13.4       15      8.3      141     78.3      [?]
18-21             3      1.7     15.1        9      5.0      143     79.9      [?]
21-24             1      0.6     15.7        6      3.4      142     80.9      175

B. Routine Follow-up

Observation     Retreated cases                 Not re-treated            Adjusted
period                                  Seropositive      Seronegative       total
(months)     Number Per Cent   Cum. %   Number Per Cent   Number Per Cent   cases*
0-1               -        -        -      187     99.5        1      0.5      188
1-2               -        -        -      161     91.5       15      8.5      176
2-3               1      0.6      0.6      122     72.6       45     26.8      168
3-4               2      1.3      1.9       85     53.8       70     44.3      158
4-5               2      1.4      3.3       57     39.4       83     57.4      145
5-6               2      1.4      4.7       41     29.2       93     66.2      141
6-7               3      2.2      6.9       31     22.9       95     70.3      135
7-8               -        -      6.9       29     22.1       93     71.0      131
8-9               1      0.8      7.7       21     16.9       94     75.5      125
9-10              -        -      7.7       21     17.0       93     75.3      123
10-11             1      0.8      8.5       18     14.7       94     76.8      122
11-12             3      2.5     11.0       14     11.9       91     77.1      118
12-15             2      1.8     12.8       10      9.0       87     78.2      111
15-18             2      1.9     14.7        6      5.7       83     79.5      104
18-21             -        -     14.7        4      4.0       81     81.2      100
21-24             1      1.0     15.7        1      1.0       82     83.2       99

* Adjusted total cases in each period includes cases observed in this period or later and cases retreated in previous periods adjusted for losses from observation.


TABLE 4

COMPARISON OF RESULTS OF THERAPY EVALUATION FOR
INTENSIVE AND “ROUTINE” FOLLOW-UP

SECONDARY SYPHILIS: CUMULATIVE RETREATMENT RATES

TREATMENT SCHEDULE: Penicillin, 3,400,000 units, 40,000 every 2 hours, “Crystalline G” (7 days). No arsenoxide, no bismuth.

A. Intensive Follow-up

Observation     Retreated cases                 Not re-treated            Adjusted
period                                  Seropositive      Seronegative       total
(months)     Number Per Cent   Cum. %   Number Per Cent   Number Per Cent   cases*
0-1               -        -        -      153     99.4        1      0.6      154
1-2               1      0.7      0.7      145     94.8        7      4.6      153
2-3               -        -      0.7      122     79.7       30     19.6      153
3-4               3      2.0      2.7       98     64.1       51     33.3      153
4-5               3      2.0      4.7       79     51.6       67     43.8      153
5-6               -        -      4.7       71     46.4       75     49.0      153
6-7               4      2.6      7.3       59     38.6       83     54.2      153
7-8               1      0.7      8.0       51     33.3       90     58.8      153
8-9               2      1.3      9.3       39     25.7       99     65.2      152
9-10              -        -      9.3       32     21.1      106     69.8      152
10-11             -        -      9.3       29     19.2      108     71.6      151
11-12             -        -      9.3       24     15.9      113     74.9      151
12-15             1      0.7     10.0       19     12.8      115     77.4      149
15-18             -        -     10.0       19     12.8      115     77.4      149
18-21             1      0.7     10.7       14      9.5      118     80.0      148
21-24             -        -     10.7        9      6.5      115     83.0      139

B. Routine Follow-up

Observation     Retreated cases                 Not re-treated            Adjusted
period                                  Seropositive      Seronegative       total
(months)     Number Per Cent   Cum. %   Number Per Cent   Number Per Cent   cases*
0-1               -        -        -      147     99.3        1      0.7      148
1-2               1      0.7      0.7      133     95.0        6      4.3      140
2-3               -        -      0.7      103     78.7       27     20.6      131
3-4               2      1.6      2.3       78     63.5       42     34.2      123
4-5               2      1.7      4.0       61     51.4       53     44.6      119
5-6               -        -      4.0       55     47.1       57     48.8      117
6-7               2      1.7      5.7       47     41.0       61     53.2      115
7-8               1      0.9      6.6       39     35.0       65     58.3      111
8-9               2      1.9      8.5       29     27.6       67     63.8      105
9-10              -        -      8.5       24     23.1       71     68.3      104
10-11             -        -      8.5       20     19.9       72     71.6      101
11-12             -        -      8.5       14     14.1       77     77.4      100
12-15             -        -      8.5       12     12.6       75     78.8       95
15-18             -        -      8.5       12     12.8       74     78.7       94
18-21             1      1.1      9.6        8      8.6       76     81.7       93
21-24             -        -      9.6        5      5.6       75     84.7       89

* Adjusted total cases in each period includes cases observed in this period or later and cases retreated in previous periods adjusted for losses from observation.


[Figure 1: three panels, one for each treatment schedule, showing cumulative retreatment rates against posttreatment observation (months, 0 to 24) under intensive and routine follow-up. Panel titles: “Schedule: 2,800,000 u Aqueous Crystalline Penicillin G (25,000 q 3 hrs)”; “Schedule: 2,800,000 u Aqueous Crystalline Penicillin G (25,000 q 3 hrs) plus Arsenoxide and Bismuth”; “Schedule: 3,400,000 u Aqueous Crystalline Penicillin G (40,000 q 2 hrs)”.]

Fig. 1. Results of therapy evaluation, intensive and routine follow-up: secondary syphilis, cumulative retreatment rates.


If the intensively followed group is taken as the population and the routinely followed group as a sample from that population, the remainder of the population consists of those patients who would have been lost to observation. Using the “effective-sample-size” notion [2], this situation closely approximates a condition where two samples are drawn from a population without replacement so that $n_1 + n_2 = N$. It can be readily demonstrated that, where $n_1 + n_2 = N$,

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sigma_{\bar{x}_1 - \bar{x}_2}} = \frac{\bar{x}_1 - \bar{X}}{\sigma_{\bar{x}_1}} = \frac{\bar{X} - \bar{x}_2}{\sigma_{\bar{x}_2}}$$

that is, the t-value of the difference between two sample means where $n_1 + n_2 = N$ is the same as the t-value of the difference between either sample mean and the parameter. Applied to the present problem, this indicates that there are no significant differences between the results for persons remaining under observation and for those lost to observation since no significant differences were noted between the routinely followed and the intensively followed patients.
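The identity claimed above can be checked numerically. The following sketch in Python assumes simple random sampling without replacement from a finite population whose variance is known, and it treats the second sample as everything left over after the first is drawn. The population values (1 for a retreated patient, 0 otherwise) and the choice of sample are hypothetical.

```python
import math


def standardized_differences(population, first_indices):
    """Return (first mean minus population mean) and (first mean minus
    second mean), each divided by its own standard error."""
    n_total = len(population)
    chosen = set(first_indices)
    first = [x for i, x in enumerate(population) if i in chosen]
    second = [x for i, x in enumerate(population) if i not in chosen]
    n1, n2 = len(first), len(second)

    pop_mean = sum(population) / n_total
    s2 = sum((x - pop_mean) ** 2 for x in population) / (n_total - 1)
    mean1, mean2 = sum(first) / n1, sum(second) / n2

    # Standard errors under sampling without replacement.
    se_mean1 = math.sqrt(s2 / n1 * (n_total - n1) / n_total)
    se_diff = math.sqrt(s2 * n_total / (n1 * n2))
    return (mean1 - pop_mean) / se_mean1, (mean1 - mean2) / se_diff


population = [0, 1, 0, 0, 1, 1, 0, 0, 0, 1]  # hypothetical outcomes
first_sample = [0, 1, 2, 3, 4, 5]            # indices kept under routine follow-up

z_mean, z_diff = standardized_differences(population, first_sample)
print(round(z_mean, 6), round(z_diff, 6))
```

The two values coincide because, when the samples exhaust the population, the difference between the two sample means is a fixed multiple, N/n_2, of the difference between the first sample mean and the population mean, and its standard error scales by the same factor.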
Table 5 shows the percentage of total cases followed for two years, the lowest limit being about 50 per cent followed for two years. Then with at least 50 per cent follow-up, patients lost to observation in the evaluation of treatment for syphilis very likely have the same experience as patients remaining under observation.


TABLE 5

POSTTREATMENT OBSERVATION UNDER INTENSIVE AND
ROUTINE CONDITIONS OF FOLLOW-UP

Special Study in Treatment of Secondary Syphilis

                                         A. Intensive Follow-up        B. Routine Follow-up
                                         Total    Followed 21-24 mos.  Total    Followed 21-24 mos.
                                         cases                         cases
Treatment Schedule                       treated  Number  Per Cent     treated  Number  Per Cent
2,800,000 units Penicillin Alone             167     148      88.6         167     107      64.1
2,800,000 units Pen. with Ars. & Bis.        193     175      90.7         193      99      51.3
3,400,000 units Penicillin Alone             154     139      90.3         154      89      57.8
METHOD 2

In this method we have compared the results of treatment with ... groups are mutually exclusive. The Division of Venereal Disease receives records from many hospitals and clinics on treatment and follow-up of patients treated for early syphilis. Many schedules of treatment are included in these records which are analyzed periodically for comparative evaluation. The results of these evaluations are published by the Division for program guidance. The follow-up on these patients is not as intensive as in the special study. All patients are advised of the importance of posttreatment follow-up before they are discharged from the treating facility. Form letters are sent to all patients reminding them when examinations are due, and occasionally visits are made to the patients by a representative of the health department. In this routine method follow-up is conducted entirely on an impersonal basis, and there are many lapses from observation. In order to test the validity of a statistical evaluation with incomplete follow-up, a group of patients treated with crystalline penicillin G from the “routine” evaluation was compared with the patients treated with crystalline penicillin G in the intensive follow-up group. Both groups of patients were treated in the same clinics and during the same time periods (July 1946-December 1948).
Among patients with intensive follow-up the amount of penicillin
ranged from 2,800,000 units to 4,200,000 units for an average of
3,130,000 units. Among the patients with routine follow-up total
dosage ranged from 2,400,000 units to 4,800,000 units for an average
of 3,369,000 units.* Available evidence indicates that a difference of
239,000 units in an average penicillin dosage of over 3,000,000 units
would make very little difference in the retreatment rates in the two
groups, inasmuch as the dosage-response curve has very little slope at
3,000,000 units and above. Therefore, the retreatment rates for the
two groups can be expected to be approximately equal.
The intensive follow-up group includes 253 cases with 92.1 per cent
observed for two years (or until retreated). The routine follow-up
group had 1,864 patients treated, of which 41.7 per cent were observed
for two years (or until retreated). Results of therapy through 24
months are presented in Table 6. At the 24th month the cumulative
retreatment rates are practically identical. Figure 2 presents a
graphic comparison of the results. It can be concluded that the
intensively followed cases were retreated earlier than those routinely
followed, as evidenced by the slightly higher retreatment rate in the
intensively followed group throughout the first year of observation.
Also cases with intensive
* The geometric mean was used in this calculation because the
dosage-response curve is such that arithmetic changes in the
retreatment rate are associated with geometric changes in dosage.
48 AMERICAN STATISTICAL ASSOCIATION JOURNAL, MARCH 1954 


follow-up are recorded as having reversed to negative more rapidly in
the first six months than did cases routinely followed. Differences are
negligible after the first six months of posttreatment observation. The
fact that intensively followed cases were observed more frequently
than those under routine follow-up would account for the earlier
detection of cases requiring retreatment among those intensively
followed. This would probably also account for differences in the
rapidity of reversal to seronegative. In spite of earlier retreatment
and seronegativity rates, however, the cumulative results from the
12th month on are almost identical for both groups.

[Figure 2: per cent of cases seronegative, seropositive, and retreated,
plotted against posttreatment observation in months (admission to 24),
for intensive and routine follow-up.]

FIG. 2. Comparison of results of treatment with crystalline penicillin G
of secondary syphilis in a group with intensive follow-up and in a
group with routine follow-up.



CONCLUSION AND DISCUSSION 


Two methods have been used to present evidence concerning the
assumption that persons lost to observation have the same experience
as persons observed in the evaluation of treatment for syphilis. In the

                                 TABLE 6

        RESULTS OF TREATMENT OF SECONDARY SYPHILIS WITH
      CRYSTALLINE PENICILLIN G IN A GROUP WITH INTENSIVE
       FOLLOW-UP AND IN A GROUP WITH "ROUTINE" FOLLOW-UP

                        Intensive Follow-up Group

                                                  Not Retreated
Observation       Retreated Cases         ------------------------------
Period        ------------------------     Seropositive    Seronegative    Adjusted
(months)      Number   Per    Cumula-     Number   Per    Number   Per     Total*
                       Cent   tive %               Cent            Cent

  0-1            1     0.4      0.4         250    98.8       2     0.8       253
  1-2            3     1.2      1.6         224    89.6      22     8.8       250
  2-3            -      -       1.6         182    72.8      64    25.6       250
  3-4            2     0.8      2.4         137    55.0     106    42.6       249
  4-5            4     1.6      4.0         111    44.8     127    51.2       248
  5-6            6     2.4      6.4          89    36.0     142    57.5       247
  6-7            5     2.0      8.4          76    30.9     149    60.6       246
  7-8            1     0.4      8.8          69    28.1     155    63.0       246
  8-9            5     2.1     10.9          54    22.1     163    66.9       244
  9-10           -      -      10.9          46    19.1     169    70.0       241
 10-11           1     0.4     11.3          40    16.6     174    72.1       241
 11-12           1     0.4     11.7          34    14.1     179    74.2       241
 12-15           3     1.2     12.9          26    10.8     183    76.2       240
 15-18           2     0.8     13.7          23     9.6     184    76.6       240
 18-21           3     1.3     15.0          19     7.9     184    77.0       239
 21-24           -      -      15.0          11     4.7     187    80.2       233

                         Routine Follow-up Group

                                                  Not Retreated
Observation       Retreated Cases         ------------------------------
Period        ------------------------     Seropositive    Seronegative    Adjusted
(months)      Number   Per    Cumula-     Number   Per    Number   Per     Total*
                       Cent   tive %               Cent            Cent

  0-1            1     0.1      0.1       1,856    99.6       7     0.4     1,864
  1-2            4     0.2      0.3       1,765    96.4      60     3.3     1,830
  2-3            7     0.4      0.7       1,526    85.2     254    14.2     1,792
  3-4           13     0.7      1.4       1,188    68.0     534    30.6     1,747
  4-5           21     1.2      2.6         940    55.5     710    41.9     1,695
  5-6           37     2.2      4.8         667    40.5     [?]    54.6     1,647
  6-7           14     0.9      5.7         523    32.9     977    61.4     1,592
  7-8           15     1.0      6.7         421    27.2   1,024    66.1     1,549
  8-9           18     1.2      7.9         328    21.8   1,056    70.2     1,503
  9-10          13     0.9      8.8         277    18.9   1,060    72.3     1,466
 10-11          16     1.1      9.9         238    16.7   1,040    73.3     1,421
 11-12          17     1.2     11.1         205    15.0   1,012    73.8     1,370
 12-15          28     2.1     13.2         [?]    11.0     986    75.6     1,304
 15-18           8     0.7     13.9          86     7.8     860    78.1     1,101
 18-21           8     0.9     14.8          54     5.7     746    79.3       940
 21-24         [?]     1.0     15.8          35     4.5     619    79.6       778

* Adjusted total cases in each period includes the number of cases
observed in this period or later and cases retreated in previous
periods, adjusted for losses from observation.




first method, therapy results in a group of intensively followed
patients were compared with the results which would have occurred
among the same patients if intensive methods had not been used. In the
second method therapy results in two mutually exclusive groups were
compared. One group consisted of the previously mentioned patients who
were followed intensively, and the other group consisted of routinely
followed patients in the same treatment centers. No significant
differences in retreatment rates were observed by either method, and it
is our conclusion that in these series, with at least 42 per cent
complete follow-up at two years after treatment, patients lost to
observation had the same experience as those who remained under
observation.
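As a rough numerical check on this comparison, the cumulative retreatment rates at 21-24 months in Table 6 (15.0 per cent of about 233 intensively followed cases, 15.8 per cent of about 778 routinely followed cases) can be put through an ordinary two-proportion test. This is only a sketch under a stated assumption: it treats the adjusted totals as simple binomial denominators, which is not the adjusted cumulative method used in the paper, and the function name is illustrative.

```python
import math

def two_proportion_z(p1, n1, p2, n2):
    """z statistic for the difference p1 - p2, using a pooled variance."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    return (p1 - p2) / se

# Cumulative retreatment rates at 21-24 months, as read from Table 6.
# Assumption: the adjusted totals are used as if they were binomial n.
z = two_proportion_z(0.150, 233, 0.158, 778)
print(f"z = {z:.2f}")  # |z| is far below 1.96
```

Under this simplification the difference is well within sampling variation, which agrees with the statement that no significant difference was observed.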

While these findings are of considerable value in the evaluation of
therapy for syphilis, there is no evidence for their application to the
study of therapy response in other diseases. For instance, where death
from a disease frequently occurs after treatment, it is not at all
likely that persons lost to observation would have the same experience
as those observed.


REFERENCES 


[1] Bauer, T. J., “Evaluation of antisyphilitic therapy with intensive follow-up,
    I. The plan,” Journal of Venereal Disease Information, 32 (1951), 355-59.

[2] Cornfield, Jerome, in Cutler, Sidney J., “Cancer illness among residents in
    Atlanta, Georgia, 1947,” National Cancer Institute of the National Institutes
    of Health, Cancer Morbidity Series, No. 1, p. 30, 1950.

[3] Iskrant, A. P., Bowman, R. W., and Donohue, J. F., “Techniques in the
    evaluation of antisyphilitic therapy,” Public Health Reports, 63 (1948), 965-77.

[4] Iskrant, A. P., Remein, Q. R., and Donohue, J. F., “Evaluation of
    antisyphilitic therapy with intensive follow-up, III. Statistical method of
    analysis and its critical evaluation,” Journal of Venereal Disease
    Information, 32 (1951), 371-75.
APPLICATIONS OF STATISTICAL METHODS TO
SEDIMENTARY ROCKS*

W. C. KRUMBEIN

Northwestern University


    Statistical methods find wide application in geology,
    especially in the study of textures and composition of
    sedimentary rocks. Certain apparent irregularities in the data,
    such as highly skewed distributions, use of weight instead of
    number frequencies, use of unequal class intervals, and some
    others, required development of special methods of statistical
    analysis. In part, logarithmic transformations permitted
    application of conventional methods to the data. Some
    sedimentary attributes approach Gaussian distributions with no
    complicating factors. Mineral composition data are commonly
    binomial or Poisson distributions. Analysis of variance and
    experimental design are becoming increasingly important in
    further analysis of geological data.


INTRODUCTION


SEVERAL circumstances controlled the development of statistical
thinking in geology. The first fields opened to statistical analysis,
some 50 years ago, concerned data which did not seem to lend
themselves to then current methods of analysis. Mineral composition of
sediments, for example, yielded discrete rather than continuous
distributions; and size frequency distributions of sediments, although
continuous, required use of unequal class intervals and were commonly
highly skewed. Moreover, a single sand sample may contain millions
of grains, so that weight percentage instead of number of grains was
more convenient for expressing frequency. Contemporary textbooks on
statistics said little or nothing about handling such irregular data.
As a result, techniques adapted to these special needs were developed
directly by geologists.
Histograms of sand analyses were used before 1900, and since about
1925 logarithmic forms of histograms and cumulative curves have been
in common use. The median and quartile deviation came into use about
1930, and the median is still the most popular average for reporting
sedimentary grain size data. In part, the inertia of established
procedure and the large amount of data published as median and quartile
summaries have operated against use of more efficient statistics in
expressing size analysis data. As long ago as 1935, Eisenhart [10]
applied the Chi-square test to sampling problems in sedimentation, and
numerous other workers, mentioned below, helped make available methods
of wider and more general applicability than those initially
introduced. Analysis of variance techniques have been used
sporadically for about a decade, and in 1946 Swineford and Swineford
[33] published a comprehensive study based on a three factor analysis
of variance model.

* Paper presented before the Chicago meeting of the American
Statistical Association, December 28, 1952. The manuscript has been
partially revised to include developments during 1953.

Other fields of geology, notably paleontology and geomorphology,
not faced initially with discrete or logarithmic distributions, adopted
standard methods of analysis from the start. Applications in these
fields have been expanded since the 1930's, and the process was
accelerated by publication of Simpson and Roe's Quantitative Zoology
in 1939 [32]. In 1948 and 1949 analysis of variance and multivariate
analysis were applied in paleontology by Burma [3] and Miller [22]. In
1950 Strahler [34] studied relations between samples and populations
of surface slopes in geomorphologic analysis.

Although modern statistical methods are being used to some extent
in geology, the general state of statistical knowledge among geologists
is rather unsatisfactory. Students seldom have courses in the subject,
and some geology teachers perhaps tend to emphasize graphic
procedures with little attention to underlying theory. It seems fair to
state, however, that there is an expanding interest in the subject and
a corresponding increase of appreciation of what statistics can and
cannot do.


PROPERTIES OF SEDIMENTARY ROCKS


Inasmuch as this paper is addressed to an audience of statisticians,
it seems appropriate to define the scope of the present subject.
Sedimentary rocks are deposits of solid materials on the earth's
surface produced by mechanical, chemical, or biological agencies in any
medium (air, water, glacial ice) under normal conditions of the
surface. All sedimentary rocks have attributes of composition, texture,
and structure. Composition refers to mineralogical or chemical make-up
of the rock; texture refers to characteristics of the grains or
particles and the grain-to-grain relations among them; and structure
refers to larger features of the deposit, such as stratification,
geometrical attitude of the strata, and included organic remains.

Textural and compositional properties of sediments have been
studied in more detail by statistical methods than have sedimentary
structures, although many geological field studies involve the
statistics of structures. Grain orientation, directions of
cross-bedding, and attitude of rock fractures have received some
statistical treatment by Reiche [30], Chayes [5], and Pincus [28].

Inasmuch as textural and compositional features lend themselves
well to laboratory study, a vast amount of statistical data has
accumulated on these properties. Table 1 defines some textural and
compositional properties of sediments, indicates those which have been
quantified, and suggests the nature of the distributions obtained in
each. Most work has been done on particle size distribution and mineral
composition. Particle shape (sphericity and roundness) has been fairly
extensively studied, whereas surface textures (such as frosted grains,
striated grains, etc.) have hardly been approached statistically
because the definitions cannot at present be operationally converted to
numbers. The interested reader will find a discussion of methods and
geological evaluation of the results in Krumbein and Pettijohn [18] and
Pettijohn [27].

                                    TABLE 1

              STATISTICAL ASPECTS OF SEDIMENTARY TEXTURES,
                   COMPOSITION, AND MASS PROPERTIES

                                                          Frequency Distributions*
                                                 ----------------------------------------------
Property           Definition                    Within single          Among closely
                                                 samples                spaced samples

Particle Size      Expressed as sieve mesh,      Log normal             Log mean is normally
                   intercepts, or in terms of                           distributed.
                   settling velocity.

Particle           Cube root of ratio between    Normal                 Mean sphericity is
Sphericity         particle volume and volume                           normally distributed.
                   of circumscribing sphere.

Particle           Ratio of average radii of     Normal                 Mean roundness is
Roundness          edges to radius of circle                            normally distributed.
                   inscribed in maximum
                   projection plane.

Particle Surface   Minute surface                [?]                    Percentage of frosted
Textures           irregularities on                                    grains is normally
                   particles. Definitions                               distributed.
                   not quantified.

Particle           Orientation of particle       Normal or circular     Mean orientation is
Orientation        axes or planes in space.      normal                 normally distributed.

Mineral            Percentage composition of     Discrete               Percentages of some
Composition        minerals present.             distributions.         minerals are normally
                                                 Binomial and           distributed.
                                                 Poisson(?)
                                                 distributions.

Porosity           Percentage of pore space      Normal                 Normally distributed.
                   in aggregate.

Permeability       Measure of ease of fluid      Log normal(?)          Log normal(?)
                   flow through aggregates.                             distribution.

Natural Moisture   Percentage of moisture in     Normal                 Normally distributed.
Content            freshly collected samples.

* Exceptions to the generalizations in these columns occur, but
available data suggest that most distributions approach normalcy or log
normalcy in their behavior.

Aggregate properties (mass properties) of sediments depend on the
associations of particles present in the deposit. Only three of a large
number of aggregate properties are listed in the table. Some
statistical work has been done on most mass properties, but in many
instances it was confined to determination of mean values and degrees
of spread.

Table 1 distinguishes between distributions of the variates within
single samples, as against distributions of mean values from closely
spaced samples. Mass properties usually yield only a single value per
sediment sample, although subsamples from a larger sample tend to
distribute themselves as shown.

Information available for the last column of Table 1 is somewhat
meager, although the kinds of distributions listed appear to apply. In
many instances numerical characteristics of sedimentary phenomena
show exponential rates of change when studied over long distances or
in large areas. Frequency distributions of sample means taken over such
larger areas sometimes are skewed, and may in some instances approach
log normalcy. Although some percentage data are normally distributed
as indicated, exceptions occur among rarer constituents.

Because size distributions and mineral data were among the earliest
investigated, they are used here for illustration to indicate the
growth of statistical methodology in sedimentation.


PARTICLE SIZE DISTRIBUTION OF SEDIMENTS

Sedimentary particles range in size from the order of 10^-4 to 10^4 mm.
in diameter. Some sediments such as glacial till include this entire
range; others have very restricted ranges of size, such as dune sand,
which extends from about 0.1 to 1.0 mm. All workers with soils and
sediments (geologists, soil scientists, engineers) realized early that
some sort of geometric size scale is necessary to facilitate analysis
and permit comparison of data. In geology the most widely used grade
scale is based on the ratio 2. The reference value is 1.0 mm. and the
scale extends in both directions, as 2, 4, 8 ... mm. and 1/2, 1/4, ...
mm.
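The grade scale just described can be sketched in a few lines. This is a minimal illustration, not taken from the paper: the function name and the choice of range are assumptions. It shows that the class limits are unequal in millimeters but equal in width once logarithms to base 2 are taken.

```python
import math

def grade_limits(k_min, k_max, ratio=2.0, reference_mm=1.0):
    """Class limits reference_mm * ratio**k for integer k in [k_min, k_max]."""
    return [reference_mm * ratio ** k for k in range(k_min, k_max + 1)]

limits = grade_limits(-4, 3)  # 1/16 mm up to 8 mm
widths_mm = [b - a for a, b in zip(limits, limits[1:])]
widths_log = [math.log2(b) - math.log2(a) for a, b in zip(limits, limits[1:])]

print(limits)      # [0.0625, 0.125, 0.25, 0.5, 1.0, 2.0, 4.0, 8.0]
print(widths_mm)   # unequal arithmetic class widths
print(widths_log)  # every class is one unit wide on the log2 scale
```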

Aside from technical problems of size analysis, not considered here,
early workers had the problem of statistical analysis of data arranged
in unequal classes. It was felt that each size grade should have equal
geometric significance (i.e., a change from 1 to 2 microns may be as
important as a change from 1 to 2 mm.), and the uniformity or
non-uniformity of the distribution should be expressible in such manner
that fine or coarse sediments can be described in similar terms. These
conditions were satisfied at an early date by the simple expedient of
drawing histograms with equal width blocks for each geometric grade
size, regardless of the absolute value of the class limits. The
earliest such histograms known to the writer were used by Udden in 1898
[35]. There is no clear evidence that an implied log transformation was
recognized at the time, although shortly after arithmetic cumulative
curves were introduced in 1920, they were converted to their log
equivalents by plotting them on semilog paper. These practices are
still followed, inasmuch as nearly all histograms are shown with equal
width blocks, and most cumulative curves are drawn on semilog paper for
direct reading of median and quartiles. Log probability paper was used
in the middle 1930's, and in 1936 the writer [17] introduced a log
transformation to facilitate conventional statistical analysis. The
transformation is given by the relation $\phi = -\log_2 d$, where d is
the diameter in mm. The minus sign was used for adjustment of graphical
methods commonly used by geologists. The phi notation permitted direct
application of moment analysis to size data and permitted definition of
a normal phi curve described by the phi mean and phi standard
deviation. This concept was extended to a phi Gram-Charlier series, and
became the basis for graphic methods introduced by Otto [25] and
extended by Inman [13]. Use of the phi mean and phi standard deviation
instead of median and quartiles also permits convenient extensions of
analysis to curve fitting, use of Chi square, applications of analysis
of variance, etc.
Although many sediments approach log normalcy, the finer-grained
sediments tend to be only partly symmetrized by the phi transformation.
The writer experimented with other transformations to normalize
these distributions, but on the whole it seems that most sediments can
be described by the first four phi moments.
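The moment analysis made possible by the phi transformation can be sketched as follows. The sieve data are hypothetical and chosen only for illustration, the function names are not from the paper, and each class is represented by its phi midpoint, which is a grouping assumption.

```python
import math

def phi(d_mm):
    """Phi transformation: phi = -log2(d), with d the diameter in mm."""
    return -math.log2(d_mm)

def phi_moments(class_limits_mm, weight_percent):
    """Phi mean and phi standard deviation from grouped weight percentages.

    class_limits_mm holds one more entry than weight_percent; each class
    is represented by the midpoint of its two limits on the phi scale.
    """
    mids = [(phi(a) + phi(b)) / 2.0
            for a, b in zip(class_limits_mm, class_limits_mm[1:])]
    total = sum(weight_percent)
    mean = sum(m * w for m, w in zip(mids, weight_percent)) / total
    var = sum(w * (m - mean) ** 2 for m, w in zip(mids, weight_percent)) / total
    return mean, math.sqrt(var)

# Hypothetical sieve analysis of a sand (illustrative numbers only).
limits_mm = [1.0, 0.5, 0.25, 0.125, 0.0625]
weights = [10.0, 40.0, 40.0, 10.0]

mean_phi, sd_phi = phi_moments(limits_mm, weights)
print(mean_phi, sd_phi)
```

Because the classes are one phi unit wide, the grouped data behave like an ordinary equal-interval frequency table, and higher moments (skewness, kurtosis) could be added in the same way.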
A problem of some interest in size analysis is the use of weight
percentage frequency instead of number frequency. This question has
been examined by Krumbein and Pettijohn [18], on the basis of earlier
work by Hatch [14], who showed that if the number frequency
distribution is log normal, the corresponding weight frequency
distribution is also log normal, with the same log standard deviation,
but with a systematic change in the log mean. In part, the problem
appears to be one of convenience; for most analyses the number of
grains per sample is very large and weighing is a more convenient means
of expression. In coarse sediments where individual pebbles can be
handled, and in microscopic examination of loose grains, number
frequency is used.
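The result attributed to Hatch can be checked numerically. The sketch below assumes that the weight of a grain is proportional to the cube of its diameter (constant density and shape), and it works in natural logarithms, x = ln d. Under that assumption, weighting a normal density in x by exp(3x) gives another normal density with the same standard deviation and a mean raised by 3 sigma squared. The function name and the values of mu and sigma are illustrative.

```python
import math

def weight_frequency_moments(mu, sigma, n_steps=20001, span=12.0):
    """Mean and s.d. of x = ln d under weight frequency, by direct summation.

    Number frequency is normal(mu, sigma) in x; weight frequency is taken
    proportional to exp(3x) times the number frequency (weight ~ d**3).
    """
    lo, hi = mu - span * sigma, mu + span * sigma
    step = (hi - lo) / (n_steps - 1)
    s0 = s1 = s2 = 0.0
    for i in range(n_steps):
        x = lo + i * step
        number_density = math.exp(-0.5 * ((x - mu) / sigma) ** 2)
        weight_density = number_density * math.exp(3.0 * x)
        s0 += weight_density
        s1 += weight_density * x
        s2 += weight_density * x * x
    mean = s1 / s0
    return mean, math.sqrt(s2 / s0 - mean * mean)

mu, sigma = 0.0, 0.5                      # illustrative log mean and log s.d.
w_mean, w_sd = weight_frequency_moments(mu, sigma)
print(w_mean, mu + 3.0 * sigma ** 2)      # log mean shifts by 3 * sigma**2
print(w_sd, sigma)                        # log standard deviation is unchanged
```

On the phi scale the same shift applies with the sign and the factor ln 2 that follow from phi = -log2 d.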


                                TABLE 2

          SEDIMENTARY APPLICATIONS OF STATISTICAL DATA

Application                           Examples

1. Graphic presentation               Histograms, cumulative curves, frequency
                                      curves, scatter diagrams, etc.

2. Summarized description and         Summaries of means, standard deviations,
   comparison                         and other parameters; tests of relations
                                      among sediments.

3. Classification of sediments        Statistical data as a basis for textural
                                      and other groupings of sediments.

4. Study of sedimentary               Graphs and maps showing systematic
   characteristics as functions       changes in sediments along streams,
   of time and space                  beaches, or over areas. Experimental
                                      design as a basis for studying population
                                      gradients and changes.

5. Use of statistical data in         Comparison of field and laboratory data
   development and testing of         on sediment transportation and deposition
   dynamic theories of sediment       with theories of sediment behavior
   behavior and variation             derived in part from stage (4).


Frequency is always the dependent variable in size analysis, and
although the physical interpretation of the data may vary with the
manner of expressing frequency, the geometrical significance of the
statistical measures is the same regardless of the particular choice of
frequency expression. The log transformation does not affect the
relative area under any grade size block and hence is independent of
the kind of frequency used.

The statistical data of particle size analysis have been used in
various ways. Table 2 summarizes these uses, which in a broad way are
the same as for other sciences. Most effort in the study of sediments
has been directed toward the first three categories. These descriptive
uses have been necessary in the sense that the characteristics of
sediments had first to be determined, and studies were needed of the
areal variations in these characteristics over any given sedimentary
environment. Once the range of values observed in nature was determined
and some insight gained into the patterns of areal variation, it became
possible to relate these observations to theory. In some instances
theory precedes practice, and the observational data are used to test
the theoretical structure.
MINERAL ANALYSIS OF SAND

The composition of sediments is expressed either in terms of chemical
or mineralogical composition. Among sediments most extensively
studied for mineral composition are sands, and in this section these
sediments will be used as an example.

Most sand deposits are composed predominantly of quartz, but
nearly all sands have small amounts of dark minerals which are
important in sediment interpretation. These minerals have a greater
specific gravity than quartz and are separated from it with heavy
liquids. The heavy minerals, which are thus separated out, may
comprise from 0.[?] to 5 or 10 per cent of the sand.

In analyzing the heavy minerals, it is found that they may consist of
a few to a dozen or more species. A sample of several hundred to a
thousand grains is studied under the microscope and the frequency of
the species present is indicated as a number percentage. Inasmuch as
there is no gradation among the mineral species, the distributions are
multivariate with discretely measured variables.

The heavy mineral data are used partly to determine the kinds of
source rocks which supplied the sediments, and partly to help decide
whether two layers of sediment may be stratigraphically equivalent.
Various statistical devices have been developed for these purposes.
In many early studies of heavy minerals, the relative abundance of
species was indicated by such terms as “common,” “rare,” etc.,
although the use of number or percentage frequencies became common
in the early 1920's. Problems quickly arose regarding the number of
grains to be counted in order to avoid undue errors in estimating the
rarer grain frequencies. Dryden [8] applied probable error theory to
the problem in 1931 and concluded that about 300 grains should be
counted. A second problem, investigated in 1935 by Dryden [9],
concerned the comparison of heavy mineral suites among samples from
different strata to test the geological equivalency of the beds. His
approach led to some discussion with Eisenhart [10] regarding the
relative advantages of the correlation coefficient and the Chi-square
test. Eisenhart's discussion clarified an important question in
treating such data, and furnishes an excellent example of the
contributions which statisticians can make to subject matter fields.
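The grain-count question can be illustrated with the usual probable error of a percentage, taken as 0.6745 times the binomial standard error. Dryden's own calculation is not reproduced in the text, so this is only a sketch under the assumption of simple binomial sampling; the 5 per cent mineral and the function name are hypothetical.

```python
import math

def probable_error_percent(p, n):
    """Probable error (0.6745 x standard error) of a percentage from n grains."""
    return 100.0 * 0.6745 * math.sqrt(p * (1.0 - p) / n)

for n in (100, 300, 1000):
    # p = 0.05: a hypothetical rare mineral making up 5 per cent of the grains
    print(n, round(probable_error_percent(0.05, n), 2))
```

The error falls only as the square root of the count, so counting beyond a few hundred grains buys little extra precision for the rarer species.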

In 1944 Allen [1] reviewed the problem of applying statistical
methods to mineralogical data. In 1949 he [2] applied the methods to a
study of mineral variations in some Cretaceous deposits in England.
Maps were presented showing areal variation in heavy mineral
percentages, and scatter diagrams with regression lines showed
relations among mineral composition, particle size, etc. Allen
demonstrated that certain “patchy” occurrences of minerals were due to
small scale nearly random processes, which however did not seriously
affect the regional picture of mineral variation.


Allen's work, in common with other heavy mineral studies, was
directed toward showing the provenance (place of origin and kind of
parent rock) of the sediments and the main directions of material
transport. Other studies are directed toward determination of the
“stability” of minerals, i.e., their tendency to persist for long
distances or times of transport. Various authors have explored this
problem statistically; Pettijohn [26] showed that an order of mineral
stability could be erected which indicates the relative persistence of
heavy minerals due to resistance to abrasion, solution, or
decomposition.

The statistical nature of heavy mineral distributions has not been
given intensive treatment in the literature. Most sands consist of a
large preponderance of quartz (plus detrital chert, feldspar, and some
other “light minerals”) and small amounts of heavy minerals. All the
minerals have discrete number frequency distributions. The more
abundant ones follow the binomial law, and the rarer ones appear to be
Poisson distributions.¹ Many minerals show a normal distribution of
mean percentages in closely spaced samples, as suggested in Table 1.

Observed and expected distributions of pebble frequencies in Lake
Michigan beach gravel are shown in Table 3, based on studies by the
writer. One hundred samples of 10 pebbles each were drawn at random
from a small beach area, and the occurrences per sample of each rock
type were noted. The gravel consists of about 50 per cent limestone, 35
per cent chert, 10 per cent basalt, and 5 per cent granite. The limestone


¹ I. W. Burr, in his oral discussion of this paper, suggested that the
data of rarer minerals may be binomial distributions of low probability
rather than Poisson. This point is considered below.

and chert values agree with their expected binomial distributions with
P of the order of 0.50 by a Chi-square test. The limestone data are
shown in the left of Table 3. The granite and basalt data agree with
expected Poisson distributions, again with P exceeding 0.50. Following
Burr's suggestion, however, the granite data were also compared with
a binomial of low probability, and the Chi-square test yielded a P of
about the same value. The right hand part of Table 3 shows the granite
data and expected values for both Poisson and low-probability binomial
distributions.

                                TABLE 3

              LITHOLOGIC COMPOSITION OF LAKE MICHIGAN
                           BEACH PEBBLES

Number of             Limestone                      Granite
Occurrences      --------------------   ------------------------------------
per Sub-                                             Poisson      Binomial
sample           Observed   Expected    Observed     Expected     Expected

    0                0         0.1         58          58.8         59.9
    1                1         1.0         33          31.2         31.5
    2                6         4.4          7           8.3          7.5
    3                7        11.7          2           1.5          1.0
    4               23        20.5          0           0.2          0.1
    5               26        24.6
    6               21        20.5
    7               12        11.7
    8                3         4.4
    9                1         1.0
   10                0         0.1

Total Number
of Subsamples      100       100.0        100         100.0        100.0
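The expected columns of Table 3 can be reproduced from the sampling scheme described (100 subsamples of 10 pebbles each). The sketch below is an illustration rather than the writer's own computation: the observed counts are as read from Table 3, the Poisson mean is estimated from the observed granite counts, and the function names are assumptions.

```python
import math

def binomial_expected(n, p, samples):
    """Expected number of subsamples showing k = 0..n occurrences."""
    return [samples * math.comb(n, k) * p ** k * (1 - p) ** (n - k)
            for k in range(n + 1)]

def poisson_expected(mean, k_max, samples):
    """Expected number of subsamples showing k = 0..k_max occurrences."""
    return [samples * math.exp(-mean) * mean ** k / math.factorial(k)
            for k in range(k_max + 1)]

# Observed granite counts per subsample, as read from Table 3.
granite_obs = [58, 33, 7, 2, 0]

# Limestone: binomial with n = 10 pebbles and p = 0.50.
limestone_exp = binomial_expected(10, 0.50, 100)

# Granite: Poisson with the observed mean, and binomial with p = 0.05.
granite_mean = sum(k * o for k, o in enumerate(granite_obs)) / 100
granite_pois = poisson_expected(granite_mean, 4, 100)
granite_bin = binomial_expected(10, 0.05, 100)[:5]

print([round(e, 1) for e in limestone_exp])
print([round(e, 1) for e in granite_pois])
print([round(e, 1) for e in granite_bin])
```

With only 10 pebbles per subsample and p near 0.05, the two granite columns differ by about one subsample in any class, which is why the Chi-square test cannot separate them.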
The writer has not explored these implications fully, but Burr's
suggestion opens a fresh viewpoint in the study of sediment
composition. It furnishes an additional example of the contributions
which statisticians can make to subject matter investigations.
Geologically the question is important because it affects
interpretation of mineral data from samples collected along lines of
natural transport, as in streams. Abundant minerals of low stability,
which display binomial distributions near their source, become depleted
as solution and abrasion act on them during transport. Do the binomial
distributions merely change by decrease of p, or is there some point at
which the binomial law gives way to a Poisson law?

It is evident from the preceding discussion that much remains to be
learned about the multivariate distributions of minerals in sedimentary
deposits. Experimental design has entered the field only slightly, and
there is ample opportunity for fundamental research in this aspect of
sedimentary petrology.


RELATIONS AMONG SEDIMENTARY PROPERTIES


A wide variety of sedimentary techniques has been used in studying
relations among properties of sedimentary rocks. Scatter diagrams and
correlation coefficients have been widely used in testing relations
among particle size, shape, composition, and the like [18, Chapter 9].
Many sedimentary characteristics vary exponentially with distance
from source, and relations between the properties themselves,
independent of distance, are commonly power functions, as may be
expected.
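Why power functions "may be expected" can be shown in one step. The symbols below are illustrative and not from the paper: two properties y_1 and y_2 are assumed to decline exponentially with distance x from the source, with constants a_1, a_2, k_1, k_2. Eliminating x gives a power relation between the two properties.

```latex
\begin{align*}
y_1 &= a_1 e^{-k_1 x}, \qquad y_2 = a_2 e^{-k_2 x} \\
x &= -\frac{1}{k_2}\,\ln\frac{y_2}{a_2} \\
y_1 &= a_1 \exp\!\left(\frac{k_1}{k_2}\,\ln\frac{y_2}{a_2}\right)
     = a_1 \left(\frac{y_2}{a_2}\right)^{k_1/k_2}
\end{align*}
```

On log-log paper such a relation plots as a straight line with slope k_1/k_2.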

In the study of ancient sediments, the conditions of origin must be
wholly inferred from relations among sediment properties and enclosed
fossil organisms. In part, present day sediments are investigated to
provide some basis for such interpretation. Comparisons of sediments
formed under known conditions also yield data on the extent to which
similar sediments may be produced by different environments. Beach
sand and dune sand provide an excellent example, inasmuch as the
responsible agents are waves and currents on one hand and wind on the
other. In many instances the dune sand is derived from beach sand by
selective wind transport, so that a continuous gradation is commonly
discernible between the two. The question whether a single unknown
sample is either beach or dune sand usually cannot be answered,
although some separation of the populations can be effected with a
group of samples.
Tests of relations among sedimentary properties commonly do not 
include evaluatiàn of experimental and? other errors. Chief reliance has 
been placed on the apparent spread or concentration of data in scatter 
diagrams. Where high correlation exists, and, the data are not confused 
by use of inappropriate ratios, these analyses are probably sound. The 
problemsof ratios and rates in studying geological data is one that re- 
quires further study. Seme variables»(such as grain sphericity and 
roundness) are defined in terms of ratios and yet show essentially nor- 
mal distributions. In some instances the use of ratio§ may disguise 
relations among raw data as Chayes [4] pointed out in 1949. 




Many geological studies suffer somewhat from failure to relate
sample statistics to the corresponding population parameters. In many
instances the sample statistics are used directly without determining
[...] or without applying tests for normalcy. Fortunately,
most sediment properties have sufficient variation areally so that the
experimental errors do not unduly cloud the relations. Many
sedimentary samples display slightly skewed distributions, and some are
highly skewed. A large number of the former show a reasonable
probability of coming from normal populations (P commonly is greater than
0.10 in Chi-square tests). In part the generalization of Table 1 is based
on such tests. Many samples are sufficiently skewed to reduce the
likelihood that they came from normal populations, but relatively little
has been done to investigate the geological conditions which may
produce skewness. Similarly, the peakedness of some sedimentary
distributions is greater than normal populations show. As with skewness, little
has been done on this aspect, although there are suggestions that
long-continued movement and agitation of sediments by geological agents
may produce highly peaked symmetrical size frequency curves.

A large number of measured characteristics of sediments show
bimodal or polymodal tendencies, which seem mainly to be the result of
mixing effects under rapidly shifting environmental conditions or of
composite sampling which includes more than a single population.
Considering that some sedimentary laminae may be a millimeter or less
thick, the mechanical process of obtaining an unmixed sample may
present a serious problem. Otto [24] introduced the concept of a
“sedimentation unit” in 1939 as a basis for critical sampling of thin units.

SAMPLING PROBLEMS IN SEDIMENTATION

The foregoing brief discussions of size and mineral data indicate
some of the main lines of statistical development in sedimentation.
Most other sedimentary attributes were quantified after size analysis
had become established, so that they were able to profit by the
statistical experience of earlier work.

Several problems only partly solved are shared in common by all
sedimentary fields. A principal one is that of sampling. It has been
known for some time that the means of samples collected in a small area
tend to be normally distributed for many sedimentary properties. If
the samples from any one deposit are spread over a larger area, the
distribution of sample means tends to develop a larger variance or to
become skewed. In part, these changes are related to areal variations in
the population brought about by changes in the physical and [...]
conditions of sedimentation over the large area. Maps which bring out
these systematic changes can be made of the mean values or variances.

Two kinds of sampling problems commonly arise in sedimentary
studies. One is concerned with the small-scale local variations in the
sediment, and the other involves the large-scale or regional variations.
These problems are met in oil exploration, for example. Oil occurs in
sedimentary rocks and shows some relation to optimum sedimentary
conditions. In studying potential oil-bearing areas, how far apart should
samples be spaced to bring out the regional sedimentary trends? How
closely should they be spaced to bring out the local departures from
regional trends? In part, oil occurs in areas which show anomalous
variations from the regional picture. As some guide to the scale of
sampling, regional maps may cover areas as great as 100,000 square miles.
Regional samples from boreholes may be spaced about one per [...]
square miles, and maximum close spacing is about 60 wells per square
mile in thoroughly explored areas.

The study of recent sediments may be illustrated by the problem of
sampling a beach 200 feet wide and several miles long. The deposits
vary across the beach, along the beach, and with depth below the
surface. The total volume of sediment may be of the order of 10^[...] cubic
feet, and the number of grains in the population may be of the order
of 10^[...]. Along with the sand samples, data are to be collected on beach
slopes, wave energy, strength of currents, etc., so that relations between
sand characteristics and geological processes can be studied. How many
samples should be taken, how should they be distributed over the
beach, and how deeply should each sample penetrate the sand layer?

Solutions of such sampling problems have been largely empirical. If
the upper layers are to be emphasized as being most nearly related to
contemporary processes, closely spaced shallow samples commonly are
collected along beach profiles spaced about [...] mile apart. An average of
the samples along the profiles is used to characterize each [...]-mile [...]
along the beach. Presumably, the closely spaced samples compensate
for local variations and the widely spaced averages bring out the
sedimentary trends.

Designed experiments, which include the [...] layout and
the number and kinds of samples necessary for any given study, [...]
to offer one of the best approaches toward the sampling problem. As
analysis of variance methods become more familiar to geologists, it is
likely that planned experiments will dominate over the somewhat [...]
organized field studies which have been the rule in the past. Cochran's
recent book [7], published since this manuscript was written, carries
many suggestions for stratified or systematic sampling plans which can
be designed for particle populations.

Some studies have been directed toward evaluation of sampling and
laboratory errors in size and mineral analysis. The writer [16] studied
the probable error of sampling beach sands in 1934; more recently
Griffiths [13] applied analysis of variance to the same data to sharpen
the concepts. In 1937 Otto [23] applied Shewhart's theory of control to
the improvement of a splitter for obtaining subsamples from larger
field samples.

Most error studies include evaluation of laboratory errors (sample
splitting, sieving, weighing, etc.), as well as field variation from sample
to sample. As an indication of magnitudes, it was found that the
laboratory error in particle size analysis of beach sand was only about
one-eighth as large as the field sampling error (0.54 against 4.5 per cent)
for the average diameter. In heavy mineral studies of the same samples the
average laboratory error, assigned mainly to counting errors, was about
equal to the field sampling error. Each was of the order of 10 per cent.
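
The magnitudes quoted above can be compared in a few lines. The sketch below is illustrative only: it assumes that laboratory and field errors are independent and combine in quadrature, an assumption the text does not state, and the function names are hypothetical.

```python
from math import sqrt


def error_ratio(lab_pct, field_pct):
    """Laboratory error expressed as a fraction of the field sampling error."""
    return lab_pct / field_pct


def combined_error(lab_pct, field_pct):
    """Total percentage error, ASSUMING the two sources are independent
    so that their squares add (an assumption, not a statement in the text)."""
    return sqrt(lab_pct**2 + field_pct**2)


def lab_share_of_variance(lab_pct, field_pct):
    """Fraction of the combined variance contributed by laboratory error."""
    return lab_pct**2 / (lab_pct**2 + field_pct**2)


if __name__ == "__main__":
    # Figures quoted in the text: (laboratory error, field sampling error), per cent.
    cases = {
        "particle size (average diameter)": (0.54, 4.5),
        "heavy minerals (order of magnitude)": (10.0, 10.0),
    }
    for label, (lab, field) in cases.items():
        print(label)
        print("  lab / field ratio      :", round(error_ratio(lab, field), 3))
        print("  combined error, pct    :", round(combined_error(lab, field), 2))
        print("  lab share of variance  :", round(lab_share_of_variance(lab, field), 3))
```

On this reading, laboratory error contributes little to the total for size analysis, whereas for heavy mineral counts it accounts for about half of the combined variance, so replicate counting would matter as much there as additional field samples.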

Griffiths and co-workers [11, 12, 31] applied two- and three-factor
analysis of variance models to evaluation of sample, operator, and
technique effects in grain orientation and porosity studies. The designs
included evaluation of interactions. Perhaps the earliest analysis of
variance study in the geological literature is that of Swineford and
Swineford [33] published in 1946, in which a three-factor model was
used to test the relative efficiency of sieve shaking equipment.

It is in the field of experimental design that some of the more
important advances in statistical analysis of geological data will be made.
In geology, unlike many physical sciences, laboratory experimentation is
not the main source of observational data. Rather, the geologist
must rely on field observations of natural processes and deposits for his
data. Experimentation plays its part in studies of sand movement in
water flumes and wind tunnels.


CONCLUDING REMARKS


The original notes for this paper were prepared during the autumn
of 1952. It was apparent at the time that rapid developments could be
expected in application of newer methods of statistical analysis to
geological data. Important papers on analysis of variance had appeared
in several fields [3, 6, 22], problems of sampling were being examined
more closely, and workers were beginning to relate sample statistics to
population parameters [34]. The Earth Science Panel of the Committee
on Statistics in the Physical Sciences of the A.S.A. had been organized
and was bringing interested workers together. Early in 1953 a
symposium of papers on statistics in geology was organized under R. L.
Miller's supervision by the Journal of Geology. The November, 1953, and
January, 1954, issues are devoted to the subject, and include several
papers reviewing or extending applications of statistical analysis to
sediments.

In some respects the last several years mark a resurgence of statistical
interest in geology, directed toward more critical analysis of
sedimentary data, and especially toward applications of experimental design.
This is in contrast to earlier principal interest in developing and
adapting techniques for description and classification of sediments, which
reached its climax in late prewar years. A glance at the 1953
developments in the expanded bibliography [12, 13, 19, 20, 21, 29, 31] will
indicate the directions of some of these later trends in sedimentary
statistical analysis.

In the newer developments of statistical application geologists are
becoming increasingly aware that progress in advanced statistical
analysis requires co-operation from statisticians. Opportunities for fuller
collaboration between earth scientists and statisticians, such as are
provided by the Committee on Statistics in the Physical Sciences, will
do much to bring subject-matter and methodology groups together.


REFERENCES

[1] Allen, Percival, “Statistics in sedimentary petrology,” Nature, 153
    (1944), 71-77.
[2] Allen, Percival, “Wealden petrology,” Quarterly Journal of the
    Geological Society, 104 (1949), 257-321.
[3] Burma, B. H., “Studies in quantitative paleontology,” Journal of
    Paleontology, 22 (1948), 725-61; 23 (1949), 95-103.
[4] Chayes, F., “On ratio correlation in petrography,” Journal of
    Geology, 57 (1949), 239-58.
[5] Chayes, F., “Statistical analysis of two-dimensional fabric
    diagrams,” in H. W. Fairbairn, Structural Petrology of Deformed
    Rocks, Cambridge, Massachusetts: Addison-Wesley Press, 1949, 297-307.
[6] Chayes, F., “The finer-grained calcalkaline granites of New
    England,” Journal of Geology, 60 (1952), 207-54.
[7] Cochran, William G., Sampling Techniques, New York: John Wiley and
    Sons, 1953.
[8] Dryden, Lincoln, “Accuracy in percentage representation of heavy
    mineral frequencies,” Proceedings of the National Academy of
    Sciences, 17 (1931), 233-38.


[9] Dryden, Lincoln, “A statistical method for the comparison of heavy
    mineral suites,” American Journal of Science, 229 (1935), 393-408.
[10] Eisenhart, Churchill, “A test for the significance of lithological
    variations,” Journal of Sedimentary Petrology, 5 (1935), 137-45.
[11] Griffiths, J. C., and Rosenfeld, M. A., “Progress in measurement of
    grain orientation in Bradford sand,” Producers Monthly, 15 (1951),
    24-36.
[12] Griffiths, J. C., and Rosenfeld, M. A., “A further test of
    dimensional orientation of quartz grains in Bradford sand,”
    American Journal of Science, 251 (1953), [...].
[13] Griffiths, J. C., “Estimation of error in grain size analysis,”
    Journal of Sedimentary Petrology, 23 (1953), 75-84.
[14] Hatch, T., “Determination of average particle size from the screen
    analysis of non-uniform particulate substances,” Journal of the
    Franklin Institute, 215 (1933), 27-37.
[15] Inman, Douglas L., “Measures for describing the size distributions
    of sediments,” Journal of Sedimentary Petrology, 22 (1952), 125-45.
[16] Krumbein, W. C., “The probable error of sampling sediments for
    mechanical analysis,” American Journal of Science, 227 (1934),
    204-14.
[17] Krumbein, W. C., “Applications of logarithmic moments to size
    frequency distributions of sediments,” Journal of Sedimentary
    Petrology, 6 (1936), [...].
[18] Krumbein, W. C., and Pettijohn, F. J., Manual of Sedimentary
    Petrography. New York: Appleton-Century, 1938.
[19] Krumbein, W. C., and Miller, R. L., “Design of experiments for
    statistical analysis of geological data,” Journal of Geology, 61
    (1953), 510-32.
[20] Krumbein, W. C., “Statistical designs for sampling beach sand,”
    Transactions of the American Geophysical Union, 34 (1953), 857-68.
[21] Manning, John C., “Application of statistical estimation and
    hypothesis testing to geological data,” Journal of Geology, 61
    (1953), 544-56.
[22] Miller, R. L., “An application of the analysis of variance to
    paleontology,” Journal of Paleontology, 23 (1949), 635-40.
[23] Otto, George H., “The use of statistical methods in effecting
    improvements on a Jones sample splitter,” Journal of Sedimentary
    Petrology, 7 (1937), [...].
[24] Otto, George H., “The sedimentation unit and its use in field
    sampling,” Journal of Geology, [...].
[25] Otto, George H., “A modified logarithmic probability graph for the
    interpretation of mechanical analyses of sediments,” Journal of
    Sedimentary Petrology, [...].
[26] Pettijohn, F. J., “Persistence of heavy minerals and geologic age,”
    Journal of Geology, 49 (1941), 610-25.
[27] Pettijohn, F. J., Sedimentary Rocks. New York: Harper and
    Brothers, 1949.
[28] Pincus, Howard J., “Statistical methods applied to the study of
    rock fractures,” Bulletin of the Geological Society of America, 62
    (1951), 81-130.
[29] Pincus, Howard J., “The analysis of aggregates of orientation data
    in the earth sciences,” Journal of Geology, 61 (1953), 482-509.
[30] Reiche, Parry, “An analysis of cross-lamination: the Coconino
    sandstone,” Journal of Geology, 46 (1938), 905-32.


[31] Rosenfeld, M. A., and Griffiths, J. C., “An experimental test of
    visual comparison technique in estimating two dimensional
    sphericity and roundness of quartz grains,” American Journal of
    Science, 251 (1953), 553-8.
[32] Simpson, G. G., and Roe, A., Quantitative Zoology. New York:
    McGraw-Hill Book Company, 1939.
[33] Swineford, Ada, and Swineford, Frances, “A comparison of three
    sieve shakers,” Journal of Sedimentary Petrology, 16 (1946), 3-13.
[34] Strahler, Arthur N., “Equilibrium theory of erosional slopes
    approached by frequency distribution analysis,” American Journal of
    Science, 248 (1950), 673-96; 800-14.
[35] Udden, Johan August, “The mechanical composition of wind
    deposits,” Augustana Library Publications No. 1, 1898.


RELATIONSHIP BETWEEN AN INDEX OF HOUSE
PRICES AND BUILDING COSTS*


DAVID M. BLANK
Columbia University


DEFLATION of residential wealth and residential construction
expenditure estimates to constant dollar levels in principle requires
the use of a price index of residential construction. However, no
national market price index covering a reasonably long period of time
exists, although house price indexes have been constructed for several
cities, usually covering a relatively few years.¹ Consequently, a
construction cost index is typically used as a substitute, on the view that
the movement of such an index is a reasonable reflection of changes in
new house prices. This article attempts to assess the validity of this
assumption and to judge the margins of error involved in employing a
construction cost index as a deflator.


POSSIBLE DIVERGENCE BETWEEN COST AND PRICE INDEXES


It could be reasoned that significant short- and long-term divergences
might arise between a valid index of the market price of homes and
indexes of construction cost. These divergences can be assigned to two
causes: technical problems in defining and measuring construction
costs and real deviations between new and old house prices.

Construction cost indexes usually exclude builders' profits and, often,
overhead charges, or add a constant percentage to direct cost to cover
these items. The apparently wide short-term variability in builders'
profits thus permits significant differences to arise between the
movement of prices of new homes and cost indexes. To the extent that there
has been a secular movement of builders' profits and overhead costs
which is not taken account of in construction cost indexes, even secular
divergences may arise.
More importantly, technical problems inherent in devising
construction cost indexes involve at least the possibility of deviations between
such indexes and a true price index of new homes. Most residential
construction cost indexes apparently are derived as some form of
weighted average of materials prices and wage rates. They vary in the
number of materials and labor skills covered and in the degree to which
the weights are based on specific rather than generalized types of
construction. The weights are usually unchanged, or changed little, over
the entire period covered. Such indexes suffer from several defects. One
such defect results from the fact that indexes of this kind cannot take
fully into account the changing importance or the relative price
movements of new materials and equipment which have been added to the
house over time. In addition, for early years there is a serious question
as to whether actual prices and wage rates, rather than nominal prices
and rates, have entered into such indexes. Finally, such indexes are
unable to take into account changes in site productivity.

* Some of the [...] Capital Formation in Residential Real Estate by
Leo Grebler, David M. Blank, and Louis Winnick (to be published by the
National Bureau of Economic Research).

¹ For example, Toledo: William Hoad, Real Estate Prices (unpublished
doctoral dissertation, University of Michigan, 1942); Washington:
unpublished data from the Housing and Home Finance Agency, quoted in
Ernest M. Fisher, Urban Real Estate Markets: Characteristics and
Financing (New York: National Bureau of Economic Research, 1951),
p. 54; Ann Arbor: Herman Wyngarden, “An Index of Local Real Estate
Prices,” Michigan Business Studies (Ann Arbor: University of Michigan,
1927).

If these technical problems were solved, cost indexes would properly
measure the changes in prices of new homes. However, discrepancies
between such cost and price indexes and an index of old home prices
could still arise. Because of the interconnection between the markets
for new and old homes their price movements should be in close
conformity at most times. Nevertheless, divergences could appear at the
trough of the building cycle when the prices of existing homes may
sink below the price at which new houses would be offered on the
market if there were any building activity. Indeed, it is for this reason that
construction volume sometimes declines to more or less negligible levels.
Discrepancies could also appear in the upswing of the cycle during
short periods when new construction lags behind the increase in demand
for dwelling units; existing houses may command premiums at such
times because of their immediate availability. At either cycle stage,
the divergences may last as long as several years.


A NEW PRICE INDEX, 1890-1934 


To test the differences in movement between a representative
construction cost index and the prices of homes, a house price index for
1890-1934 was developed and compared with the cost index. The data
for the price index were derived from the Financial Survey of Urban
Housing,² which presented financial data and other information for a
sample of residential structures in 61 cities in 1934. Detailed
information in the Survey is available only for 22 cities. The 22 cities are widely
scattered geographically, with at least two cities representing each of
the nine divisions except the East South Central division, which
had only one city in the 22-city sample. One set of questions asked of
each owner of a residential structure related to: (a) value of the
property in 1934, (b) year of acquisition by the then-present owner, and
(c) original cost to owner at time of acquisition. This information was
summarized for each city and a table presented for each of the 22 cities,
listing the number of properties included in the 1934 sample which
were acquired in each year from 1890 to 1933, the total acquisition cost
of properties acquired in each such year and the value of each group of
such properties in 1934. Separate data for all owner-occupied and all
tenant-occupied structures and for all single-family owner-occupied
houses and all single-family tenant-occupied houses were presented,
rather than over-all figures for all residential properties.

² Financial Survey of Urban Housing (Washington, D. C.: U.S.
Department of Commerce, 1937).

The data selected for analysis were those relating to single-family
owner-occupied houses, on the view that this relatively homogeneous
group which comprises the most important portion of the nonfarm
housing stock would show a more consistent pattern than the other
categories. The all owner-occupied category might have been a
reasonable alternative, but was rejected because it was less homogeneous
than the single-family owner-occupied category. The two
tenant-occupied segments were rejected because they included too small a
number of properties and because the all tenant-occupied group was too
heterogeneous. The tenant- and owner-occupied data could not be
combined as they were based on two separate samples and the size of
the two samples did not reflect the proportion of owner-occupied
and tenant-occupied properties in the respective cities.

A relative for each year was calculated for each city, based on the
ratio of the total acquisition cost of the single-family owner-occupied
houses acquired in each given year in a given city to their value in 1934.
The median relative for each year was then determined.³ This series of
median relatives, based on 1934 values equal to 100, was converted to
a 1929 base; the converted series is presented in Table 1.

³ To determine the effect of the specific averaging procedure on the
final results, a test was performed on the data for a single year in each
of the four full decades covered. The relatives for each [...] were
combined in the form of the median, positional mean, unweighted
arithmetic mean, unweighted geometric mean, and weighted arithmetic
mean (in which the weights were the number of [...] in each city at the
nearest censal year). The range of results in each year was relatively
small, so that the simplest measure, the median, was used in the
computations for the final series. Individual city relatives based on less
than four properties were disregarded in the computation of the median.

The assumptions underlying the price index warrant clarification
before any comparisons are drawn. It is assumed, first, that the
acquisition cost estimates are reasonably accurate. In all likelihood, the
estimates of acquisition cost for properties acquired in the early years of the
period studied have some margin of error. It is also assumed that the
year of acquisition has been accurately reported; here again, there
undoubtedly are significant error margins for the early years, with a
tendency for respondents to report acquisitions in years which are
multiples of five. Finally, it is assumed that the movement between median
relatives of two successive years approximates the movement in prices
of a single sample between the two years; it will be remembered that
each relative, before conversion to a 1929 base, actually represents the

TABLE 1

UNADJUSTED PRICE INDEX OF ONE-FAMILY OWNER-
OCCUPIED HOUSES, 22 CITIES, 1890-1934
(1929 = 100)*

Year     Index        Year     Index

1890      61.3        1915      71.7
1891      55.3        1916      78.5
1892      56.3        1917      80.1
1893      58.7        1918      85.2
1894      68.4        1919      93.7

1895      62.5        1920     102.7
1896      53.8        1921     100.4
1897      [...]       1922     101.8
1898      59.1        1923     103.3
1899      56.5        1924     103.5

1900      64.6        1925     108.9
1901      54.2        1926     104.5
1902      63.9        1927     100.6
1903      64.9        1928     102.1
1904      67.9        1929     100.0

1905      59.5        1930      95.7
1906      70.8        1931      87.9
1907      77.9        1932      78.7
1908      70.3        1933      75.7
1909      68.7        1934      77.9

1910      74.2
1911      72.5
1912      [...]
1913      75.3
1914      78.1

* Yearly median of 22 city relatives, excluding those relatives based on
3 or less properties.


movement in prices of a separate sample between the given year and
1934.
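
The construction of the index described above (a relative per city and year, the yearly median across cities, and conversion to a 1929 base) can be sketched in code. The figures below are invented for illustration and are not taken from the Financial Survey; the function names and record layout are likewise assumptions.

```python
from statistics import median

MIN_PROPERTIES = 4  # relatives based on fewer than four properties are disregarded


def city_relative(total_acquisition_cost, total_value_1934):
    """Acquisition cost as a percentage of 1934 value (1934 = 100)."""
    return 100.0 * total_acquisition_cost / total_value_1934


def yearly_median_relatives(records):
    """records: iterable of (year, city, n_properties, total_cost, value_1934).

    Returns {year: median of the qualifying city relatives}.
    """
    by_year = {}
    for year, _city, n_properties, cost, value_1934 in records:
        if n_properties >= MIN_PROPERTIES:
            by_year.setdefault(year, []).append(city_relative(cost, value_1934))
    return {year: median(values) for year, values in sorted(by_year.items())}


def rebase(series, base_year=1929):
    """Convert a series so that the base year equals 100."""
    base = series[base_year]
    return {year: 100.0 * value / base for year, value in series.items()}


# Hypothetical records for two acquisition years in three invented cities.
RECORDS = [
    (1904, "City A", 10, 30_000, 50_000),  # relative 60
    (1904, "City B", 6, 21_000, 30_000),   # relative 70
    (1904, "City C", 2, 9_000, 10_000),    # too few properties, disregarded
    (1929, "City A", 12, 72_000, 60_000),  # relative 120
    (1929, "City B", 8, 52_000, 40_000),   # relative 130
]

if __name__ == "__main__":
    index = rebase(yearly_median_relatives(RECORDS))
    for year, value in index.items():
        print(year, round(value, 1))
```

Because each year's relative rests on a different set of properties, the year-to-year movement of such a series is only an approximation to the price movement of a fixed sample, which is the third assumption noted in the text.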

The validity of the 1934 value estimate probably does not seriously
affect the movement of the price index, except for the 1934 value itself.
It would only affect this movement if the degree of underestimate or
overestimate of value in 1934 were correlated with length of holding.

It should be pointed out that the constructed price index applies to
both new and old houses. The relative for a given year relates the
acquisition cost of properties purchased in that year to their value in
1934, regardless of whether the acquisition was of a new or an old
structure. A cursory examination of the data indicates that somewhat more
than one-half of the properties in the 1934 sample which were acquired
in the 1890-99 decade were new houses; somewhat more than one-third
in the 1900-09 decade were new houses; and somewhat more than
one-fifth in the remaining years were new houses. It was suggested earlier
that there should be no reason for any difference in the price movement
of new and old houses, other than in periods of depressed building
activity or for short periods during the upswing of the cycle when
consumer ignorance and relative availability may play a role. At other
times, the movement of prices of houses of varying age and quality
should be roughly similar. And the price variations of new housing,
once it has entered the housing stock, should be the same as the original
stock, subject to the same differential depreciation rates as apply to an
existing housing stock composed of structures of different ages.

The index in its present form is subject to two major offsetting biases,
viz., value losses due to depreciation and obsolescence and value
increments in the form of structural additions and alterations. The price
relative for 1904, for example, before conversion to a 1929 base,
measures the change in price of a given set of properties between 1904 and
1934; this change is affected by the thirty years of depreciation
operating on these properties and is somewhat smaller than the change in
price which would be measured if this group of properties in 1934 had
the same age structure as they did in 1904. Conversely, any structural
additions or alterations to the properties between time of acquisition
and 1934 would tend to make the price rise larger between these two
periods than the theoretically correct price movement.

It is generally accepted that value losses due to depreciation and
obsolescence typically outweigh value gains due to additions and
alterations.⁴ Therefore, the present index must be biased downward
as the net result of these two kinds of value change.

⁴ [...] Capital Formation in Residential Real Estate [...].


Further corroboration for this view is found in a comparison of two
sets of house price indexes for Cleveland and Seattle (Table 2). One
set of indexes comprises the series of relatives for these two cities,
which, together with the relatives for the remaining 20 cities, provided
the basis for calculating the 22-city price index. These indexes are
TABLE 2

HOUSE PRICE INDEXES, CLEVELAND AND SEATTLE
(1929 = 100.0)

              Cleveland                         Seattle
        Garfield-Hoad   Price Index     Garfield-Hoad   Price Index
Year     Price Index    Underlying       Price Index    Underlying
                        22-City Index                   22-City Index
             (1)            (2)              (3)            (4)

1907        35.4                            64.7
1908        36.6                            60.8
1909        40.2           66.5             56.9           76.4
1910        43.9           59.1             58.8           74.4
1911        45.1           57[...]          56.9           82.9
1912        46.0           62.0             64.7           78.6
1913        47.6           63.8             62.7           78.0
1914        [...]          72.9             64.7           86.9
1915        51.2           70.0             66.7           86.9
1916        53.7           71.0             64.7           77.7
1917        58.5           77.2             [...]          76.3
1918        [...]          [...]            65.7           82.1
1919        76.8           89.6             78.4           92.6
1920        86.8          104.7             88.2           95.7
1921        87.8          102.9             86.3           93.5
1922        91.5          104.6             99.8           88.8
1923        96.3           [...]           100.0           94.2
1924       100.0          113.1            117.6           96.7
1925       102.4          142.9            109.8          102.9
1926       103.7          114.5            107.8           98.0
1927       102.4          106.1             99.9           98.2
1928        [...]          [...]           102.0           99.6
1929       100.0          100.0            100.0          100.0
1930        95.1           94.3             88.2           92.5

Sources:
Columns 1 and 3: Index derived from 3-year moving averages [...]
house and lot. Garfield and Hoad, [...].
Columns 2 and 4: Index for prices of 1-family owner-occupied homes.
[...] Financial Survey. Index one of 22 underlying 22-city price index.


subject to the same biases for depreciation and additions as the 22-city
index itself.

The second set of indexes was derived in such a manner as to exclude
any such bias. These indexes are based on three-year moving averages of
prices paid for new owner-occupied single-family homes in Cleveland and
Seattle, derived by Frank R. Garfield and William M. Hoad from
special tabulations of unpublished data from the Financial Survey of
Urban Housing.⁵ From these tabulations Garfield and Hoad were able
to compute average prices paid for new homes (including the lots
underlying the structures) of specified types in each city in each year covered.
The authors focus attention primarily on the price movement of five-
and six-room frame houses, on the assumption that changes in the
transaction-mix would affect the averages but little, since an analysis
of the distribution of prices paid for various types of homes purchased
in Cleveland in 1924 had indicated that these were relatively
homogeneous types of structures. The series for six-room frame houses in
each city, converted to indexes with a 1929 base, are given in Table 2.

The properties underlying the Garfield-Hoad indexes may have been
subject to changes in size and quality of structures and in land ratios
which would result in divergences between these indexes and a valid
house price index. But such changes were probably severely limited in
extent due to the stated homogeneity over time of the houses with
regard to size and type of structure and construction, i.e., 6-room
single-family frame houses. And the restriction of the data to new houses
specifically excludes any biases due to depreciation, obsolescence, or
additions and alterations.

A comparison between the two sets of indexes shows a significantly
greater rise in the Garfield-Hoad indexes between the pre-World War
I period and the late twenties than in the price indexes for Cleveland
and Seattle underlying the 22-city price index. This difference is fully
consistent with the existence of a downward bias in the 22-city index
due to the effects of depreciation gross of additions and alterations.

A detailed examination of empirical data, undertaken elsewhere,
suggests that the decline in value of single-family houses over the first
52 years of life, resulting from the net effect of depreciation and
obsolescence on the one hand and additions and alterations on the
other, approximates that resulting from a 1.2 per cent linear rate of
depreciation.⁵ Since the 22-city index is based on movements in the


* Frank R. Garfield and William M. Hoad, "Construction Costs and Real Property Values," Journal
of the American Statistical Association, December 1937, pp. 648-58.
Grebler, Blank, and Winnick, loc. cit. The data were derived from a special study by the Federal 


74 AMERICAN STATISTICAL ASSOCIATION JOURNAL, MARCH 


prices of structures plus land, the depreciation correction for this index
also requires a rate based on structures plus land. The relevant linear
rate, derived from the same data, is about 1.0 per cent.

All studies of the decline in market value of houses as they age clearly
indicate that a curvilinear rate of depreciation is more appropriate for
residential structures than a linear rate. The compound rate of
depreciation which yields about the same remaining value after 52 years
as a 1.0 per cent linear rate, but which approximates more closely the
path of declining value of residential structures as they age, is about
1⅜ per cent. Accordingly, the 22-city index was corrected for a 1⅜ per
cent compound rate of depreciation. The series so calculated, after
adjustment so that 1929 again equals 100, is presented in Table 3.
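
The equivalence between the linear and the compound rate described above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes that "the same remaining value" means the fraction of original value left after 52 years. Under that assumption a 1.0 per cent linear rate corresponds to a compound rate of roughly 1.4 per cent, which is in line with a rate of about 1⅜ per cent.

```python
def remaining_value_linear(rate, years):
    """Fraction of original value left after `years` of straight-line depreciation."""
    return 1.0 - rate * years


def remaining_value_compound(rate, years):
    """Fraction of original value left after `years` of compound depreciation."""
    return (1.0 - rate) ** years


def equivalent_compound_rate(linear_rate, years):
    """Compound annual rate leaving the same value after `years` as the linear rate."""
    remaining = remaining_value_linear(linear_rate, years)
    return 1.0 - remaining ** (1.0 / years)


# 1.0 per cent linear over 52 years, the horizon used in the text.
rate = equivalent_compound_rate(0.010, 52)
print(f"equivalent compound rate: {100 * rate:.2f} per cent")
print(f"value left at 1 3/8 per cent compound: {remaining_value_compound(0.01375, 52):.3f}")
print(f"value left at 1.0 per cent linear:     {remaining_value_linear(0.010, 52):.3f}")
```

The compound path loses value faster in the early years and more slowly later, which is the curvilinear pattern the text prefers for residential structures.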
Generally speaking, the corrected price index shows an upward secular
drift from 1890 to about 1916, a more rapid rise to 1920, a smaller
rise to 1925, and a decline thereafter to 1933. Between 1890 and about
1925, short cycles of about four years in duration are discernible in the
data, with peaks appearing in 1894, 1900, 1904, 1907, 1910, 1914, 1920,
and 1925.⁷


PRICE INDEX COMPARED WITH CONSTRUCTION COST INDEX 


No residential construction cost index covers the entire period from
1890 to 1934, but the Boeckh residential construction cost index, based
on 20 cities, starts in 1910 and can be extrapolated back to 1890 in a
customary fashion by the use of building materials and building wage
rate indexes. The Boeckh index is one of the few adequate construction
cost indexes available and is the only one aimed specifically at measuring
changes in cost of construction of residential structures.⁸

The combined index is presented in Table 4. The construction cost


Housing Administration of a sample of single-family homes appraised by FHA in 1939; from William
M. Hoad, "Real Estate Prices, A Study of Residential Real Estate in Lucas County, Ohio," unpublished
doctoral dissertation, University of Michigan, Ann Arbor, 1942; and from Raymond Goldsmith's
analysis in "A Perpetual Inventory of National Wealth," Studies in Income and Wealth, Vol. XIV
(New York: National Bureau of Economic Research, 1951) of data gathered by the Financial Survey
of Urban Housing. It should be pointed out that a 1.2 per cent linear rate of depreciation for houses is
significantly below the rates presented in Bulletin F and used by the Department of Commerce and
most other investigators in this field, even after all adjustments are made for comparability.

7 The short cycle in house prices approximates closely in length the short cycle found in building
by Long. Clarence D. Long, Jr., Building Cycles and the Theory of Investment (Princeton, 1940),
p. 104.

8 E. H. Boeckh and Associates actually construct ten indexes for different types of structures,
both residential and nonresidential, for various cities. Two of these indexes, for frame and for brick
one- to six-family residential structures, for 20 cities have been combined by the several successive
federal housing agencies into a single residential cost index which is used by the Department of
Commerce in deriving the residential construction expenditure component of the deflated Gross National
Product series. It is this index which is referred to in the text.


" 


HOUSE PRICES AND BUILDING COSTS e T 


TABLE 3

PRICE INDEX OF ONE-FAMILY OWNER-OCCUPIED HOUSES,
22 CITIES, CORRECTED FOR 1⅜ PER CENT COMPOUND
ANNUAL DEPRECIATION, 1890-1934
(1929 = 100.0)

Year    Index            Year    Index
1890    36.0             1910    51.8 (?)
1891    [illegible]      1911    56.7
1892    34.0             1912    59.7
1893    35.9             1913    60.5
1894    42.4             1914    63.7
1895    39.0             1915    59.2
1896    34.8             1916    65.8
1897    35.9             1917    68.0
1898    [illegible]      1918    73.8
1899    37.5             1919    81.7

1900    43.5             1920    90.8
1901    37.0             1921    90.0
1902    42.4             1922    92.5
1903    45.5             1923    95.2
1904    48.3             1924    [illegible]
1905    42.9             1925    103.1
1906    [illegible]      1926    100.4
1907    [illegible]      1927    97.9
1908    52.8             1928    100.7
1909    62.8             1929    100.0
                         1930    97.1
                         1931    90.4
                         1932    82.0
                         1933    80.0
                         1934    [illegible]

Source: Index, Table 1, corrected for 1⅜ per cent compound annual depreciation.

index for 1890-1934 and the corrected house price index (Table 3) for
the same period are compared in Chart I. A comparison of the two indexes
suggests two important conclusions with regard to the relationship
between construction costs and house prices.

Except for the period 1916-1922,⁹ the price index shows more short-
run variability than the cost index. The latter is quite stable over the


9 The cost index rises to a much sharper peak in 1920 than does the price index. This sharp rise in
1920 is found in all construction cost indexes and probably reflects a real difference in construction
costs and prices in that year. It seems to have been a result of a unique set of supply and transportation
difficulties in the winter and spring of 1920.




TABLE 4

RESIDENTIAL CONSTRUCTION COST INDEX, 1890-1934
(1929 = 100.0)

Year    Index            Year    Index
1890    39.2             1915    53.5 (?)
1891    37.9             1916    [illegible]
1892    36.8             1917    66.6
1893    36.7             1918    79.2
1894    35.4             1919    92.1

1895    34.9             1920    118.7
1896    35.1             1921    95.4
1897    34.4             1922    87.7
1898    35.9             1923    98.3
1899    38.5             1924    90.9
1900    40.6             1925    96.2
1901    40.1             1926    96.9
1902    41.5             1927    95.6
1903    43.0             1928    95.9
1904    42.5             1929    100.0
1905    44.5             1930    97.5
1906    48.9 (?)         1931    89.9
1907    [illegible]      1932    76.1
1908    49.5             1933    76.2
1909    51.4             1934    82.9

1910    [illegible]
1911    [illegible]
1912    53.8
1913    51.9
1914    52.2
Sources:

1890-1906: 1907 value extrapolated by weighted average of an index of average wages per hour
in the building trades and an index of building materials prices. Wage index from Department of
Commerce and Labor, Bulletin of the Bureau of Labor, No. 77, July 1908; see Historical Statistics,
p. 66. Price index from Handbook of Labor Statistics, 1941 edition, Vol. 1; see Historical Statistics,
pp. 233-34. Weights: wages, 1.0; materials, 1.5. Weights derived from NHA analysis of housing
costs; see Housing Statistics Handbook, p. 32.

1907-1909: 1910 value extrapolated by weighted average of an index of wage rates in the building
trades and an index of building materials prices. Wage rate index from Bureau of Labor Statistics
annual reports, Union Wages and Hours in the Building Trades; see Historical Statistics, p. 69. Price
index from same source as 1890-1906. Weights same as above.

1910-1914: 1915 value extrapolated by Boeckh index of residential construction cost, as given in
Historical Statistics, p. 173.

1915-1934: Boeckh residential construction cost index, as given in Construction and Building
Materials, Statistical Supplement, May 1951, Department of Commerce, p. 40, converted to 1929 base.

CHART I. Price index of single-family owner-occupied houses, corrected for
depreciation, and residential construction cost index, 1890-1934. (1929 = 100.)
[Chart not reproduced; the two plotted series are labeled "Construction Cost Index" and "Price Index."]

Source: Tables 3 and 4.


pre-1916 period, partly perhaps as a result of the way in which it is
constructed, while the price index shows substantial fluctuations. Between
1905 and 1909, for example, the price index has a rise of more
than 34 per cent and a fall of almost 10 per cent, as compared with the
cost index which rises only 15 per cent between 1905 and 1907 and declines
only 3 per cent between 1907 and 1908. The same relationship
holds for the period after 1922; the price index falls 5 per cent between
1925 and 1927 while the cost index remains almost unchanged. In
sum, it seems reasonable to conclude that in most periods the market
price of homes fluctuates more widely over the short run than do
construction costs as measured by standard construction cost indexes. As
a result, the annual movements of any construction series deflated by a
construction cost index are subject to some margin of error.

But equally important is the fact that the long-run movement of the
two indexes is remarkably similar. Thus, the construction cost index in
1921-1929 is about 245 per cent of its level in 1895-1905; the corrected
price index in 1921-1929 is about 241 per cent of its level in 1895-
1905. It must be remembered that the price data and depreciation data
underlying the corrected price index are derived from independent
sources and that both are completely independent of the cost data
underlying the construction cost index. In view of this independence of
derivation, the almost identical long-run movement of the two series
over four and a half decades argues strongly that the construction cost
index measures with quite reasonable accuracy the secular movement of
house prices.¹⁰
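
The long-run comparison above can be reproduced mechanically. The following is a rough sketch, not the author's stated procedure: it assumes that the "level" of an index in a period means the simple average of its annual values, the helper names are hypothetical, and the cost index figures are transcribed from Table 4 as legible in this copy (the 1897 value is read as 34.4), so they may contain reading errors.

```python
def period_mean(index_by_year, first_year, last_year):
    """Simple average of the annual index values from first_year to last_year inclusive."""
    values = [index_by_year[year] for year in range(first_year, last_year + 1)]
    return sum(values) / len(values)


def period_ratio(index_by_year, early, late):
    """Average level in the late period divided by average level in the early period."""
    return period_mean(index_by_year, *late) / period_mean(index_by_year, *early)


# Residential construction cost index (1929 = 100), as read from Table 4.
cost_index = {
    1895: 34.9, 1896: 35.1, 1897: 34.4, 1898: 35.9, 1899: 38.5, 1900: 40.6,
    1901: 40.1, 1902: 41.5, 1903: 43.0, 1904: 42.5, 1905: 44.5,
    1921: 95.4, 1922: 87.7, 1923: 98.3, 1924: 90.9, 1925: 96.2,
    1926: 96.9, 1927: 95.6, 1928: 95.9, 1929: 100.0,
}

ratio = period_ratio(cost_index, early=(1895, 1905), late=(1921, 1929))
print(f"1921-1929 level as a percentage of 1895-1905: {100 * ratio:.0f}")
```

With these transcribed values the ratio comes out at roughly 243 per cent, close to the "about 245 per cent" quoted in the text; the small gap may reflect transcription errors or a different averaging convention. The same function could be applied to the corrected price index once its illegible years are recovered.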


CONCLUSIONS 


The 22-city price index and the construction cost index show
significant short-term divergences. These suggest that market prices of
homes fluctuate more widely than construction costs, the difference in
rise or fall perhaps amounting to as much as 10 per cent in a period of
several years. For short-term analysis, then, some margins of error are
involved in using the cost index as an approximation of a price index.

With regard to long-term movements, however, the construction
cost index conforms very closely to the price index, corrected for
depreciation. It would appear, therefore, that for long-term analysis the
margin of error involved in using the cost index as an approximation
of a price index cannot be very great.


10 Only if there were major increases in site productivity not reflected in the construction cost
index might this view be questioned. Although data on this question are extremely scanty, there is




CYCLES IN THE BALANCE OF PAYMENTS*


SOLOMON FABRICANT
New York University


Looking back at the vast literature on international trade theory,
Jacob Viner noted in 1937 that the older discussions contained “only
scattered and incidental references to the repercussions on the
international mechanism of cyclical fluctuations in business activity”
(Studies in the Theory of International Trade, p. 432). But he observed
also that “within the last few years” the question was being “more
seriously tackled;” and, as we now know, at the very time he wrote this
spark of energy was already being blown into flame by the lively breath
of Keynes’s General Theory. In the past decade and a half, however,
little of this energy has been devoted to the “inductive spadework on
the international aspects of business fluctuations” that Viner rightly
felt was one of the steps necessary in a fruitful attempt to “incorporate
cycle theory into the theory of international trade ... or to apply
international trade theory to cycle theory.”

The statistical analysis by Chang is therefore one of the exceptions.
Indeed, it is the only sustained attempt to determine in some systematic
fashion the characteristic cyclical behavior of the several items in
the balance of payments of a variety of countries that has been published
so far. Such enterprise merits applause, and in this case even
wonder, for Mr. Chang, apparently working without statistical assistance,
must have spent his days and his nights with the calculations.

Necessary perspective on Mr. Chang’s work is provided if we start
by asking ourselves what “inductive spadework on the international
aspects of business fluctuations” should attempt to uncover. Suppose
(let us dream) that we have unlimited time, money, and data. For any
country, then, we want to know how each of the several items in its
balance of payments fluctuates, and how each behaves during domestic
and foreign business cycles. How well do changes in exports, for example,
conform to changes in domestic business conditions? Do turns in
exports usually lead, coincide with, or lag behind, turns in domestic
business, or is their timing irregular; what is the average amplitude of
fluctuation of exports, and the variation about this average, during
domestic business cycles; do exports usually rise most vigorously during
the early stages of domestic business expansion or during the later
stages, or is there no systematic difference; does “quantum” of exports

* A review article of Cyclical Movements in the Balance of Payments, by Tse Chun Chang. Cambridge
(England): Cambridge University Press, 1951. Pp. x, 224. $3.75.




fluctuate more or less than price of exports, and price of exports more
or less than price of imports and other goods? Are there any systematic
differences between the cyclical behavior of exports of raw materials
and those of fabricated products, and if so, what are the relative weights
of the several classes of goods, and how have long-run changes in these
weights affected the cyclical behavior of the total? And we ask, also,
if the behavior of exports seems about the same whether the domestic
cycle is short or long, mild or severe; and if there are differences in
cyclical behavior systematically related to the rate of growth of the
economy or its several parts, or to other secular or structural changes,
e.g., in tariff barriers and monetary standards. Then, looking over a
country's borders at business conditions abroad, we consider how the
latter are related to domestic conditions and to fluctuations in the items
in its balance of payments. Having thus studied cycles in the balance
of payments of each country (or an adequate sample of countries)
we would go on to see whether the countries fall into homogeneous
groups, each with its characteristic type of balance of payments
fluctuation: whether, for example, “industrial countries” differ from
“raw-material producers,” or industrial countries heavily engaged in export
production from those in which production for export markets is relatively
small. And we would search also for similarities among countries
with respect to secular change in behavior.

These questions would be put differently and in different order by
a person with other tastes and theoretical predilections, but in one form
or another they, and many other factual questions, would be in the
list of every economist earnestly seeking light on the international
aspects of business fluctuations.

To answer these questions (remember, we are dreaming) we would
use many rather long statistical series, and most of these would be on
a monthly or quarterly basis. We would use a statistical apparatus that
enabled us to study the shape of individual cycles in each series, as
well as averages. We would relate developments in each country to
those in countries closely tied to it by trade and finance, as well as to
those in the rest of the world as a whole.

It will be no surprise to the reader that Mr. Chang has answered few
of our questions. Apart from the obvious reasons, others appear as we
take stock of what he did.

For each of six countries (Britain, the United States, Sweden, Australia,
Chile, and Canada) Chang took the annual figures for each
major item in the balance of payments (except the net gold and capital
flow) during the period 1924-1938, eliminated trends, and determined,
one at a time, the linear logarithmic equation relating the fluctuations
in each series to those in the presumed causal factors. Reading
off the regression coefficients, he had the percentage by which each item
in the balance of payments changes on the average with a one per cent
change in each of the “independent” factors. These factors were then
related by the same process to real “world income” (more exactly, the
real income [...] of the world). Simple substitutions in the equations gave
the percentage change in each item which could be expected if
world income rose or fell by one per cent. Applying these percentage
changes to the figures in a base period, he calculated the corresponding
absolute amount of change in each item. The net difference among
these gave him the absolute change that could be expected in net capital
flow, including gold and the residual error. The various equations,
and the tabulation of the absolute changes to be expected when real
world income rises or falls by one per cent, constitute his statistical
[...] of the cyclical fluctuations in the balance of payments of
[...].

The results of this process of summarization may be illustrated by
the American figures. The following table gives the regression
coefficients (i.e., the elasticities):


                        Elasticity with respect to
R*     U.S.       World      U.S.       World      U.S.       Relative
       Real       Real       Money      Money      Import     Export
       Income     Income     Income     Income     Price      Price


° “©, 
1 9.27 rs -.97 
997 2.92 "us — 438 
-91 1.38 

.98 .94 $ 
1.40 


Ё 
эө 


* eS 


Я LJ = 

* Coefficient of multiple correlation.
U.S. import price: import price (with tariff) divided by U.S. cost of living.
Relative export price: U.S. export price divided by U.S. competitors' export price.
[...] 1932 are excluded because of the abnormal influence of world exchange depreciation.
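
The kind of log-linear fit described above, in which the regression coefficient is read directly as an elasticity, can be sketched as follows. This is a simplified one-factor illustration, not Chang's actual procedure (his equations involve several factors and trend-eliminated data); the function name and the figures are hypothetical, and the data are constructed so that the true elasticity is exactly 1.5.

```python
import math


def loglog_elasticity(x, y):
    """Least-squares fit of log(y) on log(x); returns (slope, intercept).

    The slope is the elasticity: the percentage change in y associated
    with a one per cent change in x.
    """
    log_x = [math.log(value) for value in x]
    log_y = [math.log(value) for value in y]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((a - mean_x) ** 2 for a in log_x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical income and import-quantity series (not taken from the book).
income = [90.0, 95.0, 100.0, 104.0, 110.0]
imports = [3.0 * value ** 1.5 for value in income]

slope, intercept = loglog_elasticity(income, imports)
print(f"estimated elasticity: {slope:.3f}")
```

On real annual data the points would scatter around the line, which is why the reviewer later asks for standard errors and for a look at the individual observations.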


. 
e 
9 У € e * 


€x e 


82 AMERICAN STATISTICAL ASSOCIATION JOURNAL, MARCH 10954 


Chang derives the “cyclical pattern of American income account,”
expressed in millions of dollars of change, by applying these elasticities
to average values in the base period 1925-1926. For a one per cent rise
in world real income (and a corresponding 2.15 per cent change in
U.S. real income) he has:

Imports                  -178.8
Exports                  +195.0 (?)
Interest receipts        + 18.9
Interest payments        [illegible]
Other current items      - 14.9 (?)

Net change in balance    + 17.8
(All items rise; as usual, the minus signs represent debit items.)
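
The step from elasticities to dollar amounts amounts to two multiplications, sketched below under stated assumptions. The function names, the import elasticity of 1.5, and the base value of 4,000 are hypothetical illustrations; only the 2.15 figure (the response of U.S. real income to a one per cent rise in world real income) comes from the text.

```python
def pct_change_in_item(elasticity_wrt_own_income,
                       own_income_elasticity_wrt_world,
                       world_income_pct_change=1.0):
    """Chain the two elasticities: world income -> national income -> item."""
    return (elasticity_wrt_own_income
            * own_income_elasticity_wrt_world
            * world_income_pct_change)


def absolute_change(base_value, pct_change):
    """Convert a percentage change into an absolute change from a base-period value."""
    return base_value * pct_change / 100.0


# Hypothetical import elasticity and base value; 2.15 is the figure quoted in the text.
import_pct = pct_change_in_item(1.5, 2.15)
import_change = absolute_change(4000.0, import_pct)
print(f"imports change by {import_pct:.3f} per cent, or {import_change:.1f} in base-period units")
```

Summing such changes over all current items, with debit items entered as negatives, gives the net change in the balance that Chang tabulates.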


What we get out of this for each of the six countries is the usual
direction and average amplitude of fluctuation in each major item in its
balance of payments relative to a given movement in the “world cycle”
(or, if one wishes, in the country’s own real income), and some notion
of the separate shares of price and quantity variation in import and
export value change.

In order to be able to say something about “typical” differences between
raw-material producers and industrialized countries, Chang supplements
these calculations for the sample of six countries with a less
detailed examination of the data for another 15 or 16 countries. For
these countries he determines the first two of the equations given in
the above list, for the United States, and thus obtains the elasticities
of quantity of imports with respect to real domestic income and relative
price (his Table 4) and of quantity of exports with respect to real
world income and relative price (Table 6). All told, then, he has the
factors determining (or associated with) the imports of 21 and the
exports of 22 countries, with 19 common to both lists. Again, the period
is 1924-1938, the data are annual, and the trends have been eliminated.

Chang finds, with respect to imports, that the price factor is of minor
importance: all but one of the elasticities are below unity, and for 13
countries they are below .5, neglecting the minus sign. These findings
tell us something about the extent to which price fluctuations are
associated with quantity fluctuations (income constant). Chang’s reading
of them as demand elasticities in the neoclassical sense, however, raises
questions of the sort that troubled economists in interpreting statistical
demand curves during the 1920's. Indeed, Chang’s interpretation of his
results, reproduced from his earlier published articles, has already
been criticized (see, for example, Guy Orcutt's article in the May 1950
Review of Economics and Statistics).


Chang concludes that the important demand [...] one of the income
elasticities are above unity, [...]

With respect to exports, the price elasticity (again ignoring the minus
sign) is above unity for 4 countries, between .5 and 1.0 for 9, and below
[...], we find that Table 6 depicts a [...] reverse to that of
Table 4. The countries whose import income elasticity is less than that
of the world as a whole are those whose export income elasticity
is greater than that of the world as a whole; and conversely. Or, speaking
more generally, for the former cases, the import income elasticity
[...] latter cases, the export income elasticity tends to be smaller than the
import income elasticity” (p. 51).

Chang is rather careless here, for he is comparing the import elasticity
of a country with respect to its own income, and its export elasticity
with respect to world income. It is something of a jump to imply here,
and assert explicitly later (p. 170), that “the difference in the magnitude
of import and export income elasticity of all the agricultural countries
tends to result in a large and unfavorable change in relative quantities
in prosperity.” Except for the six countries studied in detail,
Chang presents no data showing how the income of individual countries
fluctuates in relation to world income. However, the statement
about change in relative quantities might be warranted. Agricultural
output in the United States (and therefore presumably also purchases
of such output by domestic “industry”) fluctuates within a narrower
range than does mining and manufacturing output (and therefore,
presumably, purchases of such products by farmers). This is probably
true of trade within other countries, and possibly of international
trade [...].

Chang’s elasticity calculations relate to quantities, not values.
Except for the six countries he provides no evidence on the cyclical
fluctuations in terms of trade. However, it is pretty clear, again from
information outside of Chang’s book, that the terms of trade change in
favor of agricultural communities when business improves. Further,
Chang shows (for eleven agricultural countries combined, p. 169) that
the balance of merchandise trade was negative during 1924-1930 and
positive during 1931-1938. Chang is led to assert, therefore, that
change in relative quantities tends to more than offset relative price
change (p. 170).

The picture for industrial countries is the opposite, of course. In
their case, the merchandise export balance rises as world business
improves and falls as world business worsens.

In the case of mining countries, Chang believes the change in relative
quantities to be small, but the change in relative prices to be great
and in favor of these countries as world business expands. Therefore
their merchandise export balance tends to behave like that of industrial
countries.

If all this is true it means (ignoring minor items in the balance of
payments) that net capital export by the industrial and mining countries
tends to be greater (or net capital import smaller) during world
prosperity than during world depression, with the reverse for agricultural
countries. These generalizations about changes in capital flows
and associated changes in trade quantities, prices, and values during
fluctuations in world business constitute Chang’s major findings.

It is clear that the scope of Chang’s results is narrow and that we are
far short of having all the answers to the questions listed earlier. But
no investigator working alone could have gotten very far in filling that
bill.

Another criticism, however, must be made of the adequacy of his
results. There, too, the conclusion is clear. We are not sure of the
answers that Chang has given us. And the grounds for our doubts are
much the same as those that inevitably restrict the scope of his results.

One reason is the limited range of the data analyzed. Whatever
apparatus is applied in the analysis, it cannot be expected to extract from
short series of annual data reliable answers to our questions. It might
perhaps be argued that the world economy of pre-1914 days was so
different from that of the inter-war period, that we must make shift
with the 1924-1938 data if our concern is with the workings of the
economic world of that period; but this cannot be determined a priori,
and I do not believe that it jibes with the known facts. Chang should
not and need not have confined himself to the period 1924-1938. Nor
should he have restricted himself to annual data, for annual data suppress
too many significant features of cyclical change. Limits on his
time and energy would have forced him to be less ambitious in other
[...] but I suspect that he chose the inferior alternative. Study
[...]

[...] that arise about his trend eliminations. Chang tells us practically
nothing about them. Since we know that his period of fifteen
years is short, and that it includes an exceptionally [...] cycle, we tend
[...]

[...] income, drew a regression line through the fifteen points in the
[...], read off the slope of the line, and discarded the basic data
[...] diagram. The regression coefficient (i.e., the elasticity) hardly
[...] that might be learned from the basic data or even the chart.
For example, is the slope greatly influenced by the extreme points;
is the slope a reflection mainly of a single large cycle (that of
the 1930's) or does it fairly reflect all the cycles including the (two)
[...]er ones? Chang provides enough scattered information to raise
serious doubts about this, but no systematic examination is made
[...] information given to enable the reader to undertake it
[...]

[...] coefficients are [...] given by Chang. However, the
standard errors of the regression coefficients are conspicuous by their
absence, a serious deficiency in a context in which the problem of
multicollinearity is important.
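
Two of the checks the reviewer asks for, the standard error of a regression slope and the sensitivity of the slope to individual extreme points, are easy to state in code. The sketch below is a generic illustration with hypothetical data and function names, not a reconstruction of Chang's calculations; it uses a simple one-variable least-squares fit and a leave-one-out refit, one common way (an assumption here, not the reviewer's stated method) of seeing whether a single observation dominates the slope.

```python
import math


def ols(x, y):
    """One-variable least squares; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def slope_std_error(x, y):
    """Conventional standard error of the slope, using n - 2 degrees of freedom."""
    n = len(x)
    slope, intercept = ols(x, y)
    residual_ss = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    mean_x = sum(x) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    return math.sqrt(residual_ss / (n - 2) / sxx)


def leave_one_out_slopes(x, y):
    """Slope re-estimated with each observation dropped in turn."""
    return [ols(x[:i] + x[i + 1:], y[:i] + y[i + 1:])[0] for i in range(len(x))]


# Hypothetical data: five points on the line y = x plus one extreme observation.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.0, 1.0, 2.0, 3.0, 4.0, 15.0]

full_slope, _ = ols(x, y)
std_error = slope_std_error(x, y)
loo = leave_one_out_slopes(x, y)
print(f"slope {full_slope:.2f}, standard error {std_error:.2f}")
print("slopes with each point dropped:", [round(s, 2) for s in loo])
```

In this constructed case the full-sample slope is more than twice the slope obtained once the extreme point is dropped, which is the kind of dependence on a single large cycle that the review worries about.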


There is a question also about the use of straight lines on double-log
[...]. Do they always tell the story? They could not, for example,
[...] adequately the relation between an asymmetrical cycle in one
[...] a symmetrical cycle in another.

[...] the reader will realize that I am suggesting the use of some such
apparatus as Mitchell's in studying cycles; but was not this apparatus
[...] by an expert for the very purpose?

[...] assumes that a rather tightly-knit world economy existed in
[...], but on the next page dismisses this as [...]. He seems
to have been a victim of his choice of data and method of analysis,
which led him to assume the world-wide diffusion of the depression of
the 1930's to be characteristic of cycles generally. (A. F. Burns points
out that “after 1919 the business cycles of different countries tended to
drift apart, though practically all shared in the catastrophic contraction
of 1929-32;” see Papers and Proceedings, American Economic Review,
May 1949, p. 82.) Careful examination even of that episode might
have raised some doubts in his mind about the assumption of a closely
integrated world economy. By failing to present data for more than the
six countries on how the income of individual countries fluctuates in
relation to world income, he fails to provide the basis for such an
examination, and fails to present convincing evidence for his assertion
that the “trade cycle is a world-wide phenomenon” (p. 220). Indeed,
we know that the peak even before the contraction of the 1930's came
at different times (even on an annual basis; see Thorp’s business annals
for 1926-1931, News-Bulletin of the National Bureau of Economic
Research, Sept. 1932); and Chang’s equations for the six countries suggest
how greatly contractions have varied in severity (the “elasticity”
of national real income with respect to world real income ranged from
.58 for the U.K. to 2.15 for the U.S.).

As part and parcel of the above assumption Chang assumes that
there is a single world market for all export countries and that world
income is the dominant demand factor with respect to each. However,
the “world market” is a group of markets, closely interrelated for some
commodities, loosely for others: recall A. J. Brown’s observations (in
Chapter VI of his Applied Economics), and Chang himself notes that
countries have “exclusive” markets and that “world market is an ambiguous
notion” (pp. 53, 70). Since the incomes of the various importing
countries do not in fact fluctuate identically, we cannot expect that all
exporting countries will be confronted with the same demand conditions.
The aggregate of world income is therefore hardly an appropriate
measure of the strength of the demand confronting any individual
[...]
Mitchell’s What Happens during Business Cycles, p. 58, footnote, and
his earlier 1913 report], will contribute to diversity of national income
experience among the countries of the world.)

There are some questions, finally, about the accuracy of Chang’s
basic data, as well as the combinations he made of them. Students of
income statistics will wonder about the adequacy of the real income
estimates for the score of countries covered. Others will raise their
eyebrows at some of the series on quantum and price of imports and exports.
Chang mentions sources but does not go sufficiently into the details
of the construction of the estimates or their adequacy for his
purposes; nor does he indicate what change in his findings might result
were use made of alternative estimates, where these are available.

We are left, at the end, with doubts not only about the applicability
of Chang’s findings for 1924-1938 to other periods, but also about their
adequacy for the period he covered.

The chapter in Viner’s book to which reference was made above
opened with a statement unearthed by him from a work published in
1857: “Many writers have perplexed themselves and their readers by
founding theories on exceptional circumstances. Others have been led
astray by statistics—the characteristic form of modern research.”
Chang’s book illustrates both dangers. Yet we should not forget that
his study is the first attempt at a systematic survey of cyclical movements
in the balance of payments of a wide variety of countries. His
energy, even his boldness, in grappling with stubborn facts sets us
an example. While we cannot follow in his footsteps all the way, we
may discover the right path more easily because of his pioneering efforts.


DEMAND ANALYSIS*


H. S. HOUTHAKKER
University of Chicago


In a field where papers are plentiful but books are rare the appearance
of a monograph by one of the leaders in its development raises high
expectations. Professor Wold is perhaps best known as a mathematical
statistician, but he has also made valuable contributions to economic
theory [15] and his earlier empirical study on the demand for farm
products [14] would no doubt be generally recognized as a classic if
its language and time of publication had not curtailed its circulation.
For the present work he associated himself with Mr. Jureén, a government
statistician; he has also drawn upon the support of a number of
other distinguished collaborators in Sweden and elsewhere. It should
be said at once, lest the following criticisms obscure the appreciation,
that the hopes thus raised are not disappointed. Demand Analysis is a
highly instructive and provocative work that no economist or statistician
could consult without profit, and for specialists it is indispensable.


In common with other branches of econometrics the study of consumers'
behavior requires knowledge of the relevant chapters of economic theory
and statistics in addition to skill in the interpretation and
utilization of observational data. The book under review is organized
around these three elements. After a first part which surveys the
subject and summarizes the results there follow sections on the theory
of choice, on stochastic processes, and on regression analysis; finally,
some empirical investigations of the demand for foodstuffs in Sweden
are discussed. Three of the five parts are completed by numerous
exercises, some of which contain interesting new results.

ECONOMIC THEORY

The purpose of theory in an “applied” subject consists mainly in the
formulation of (a) concepts in terms of which observations can be
usefully described and (b) theorems, phrased in those terms, that allow
statements about situations for which adequate observations are
lacking. The usefulness of these concepts depends on their maintaining
a certain invariance between different situations; thus, if the
quantities bought by consumers showed little or no correlation with
prices and incomes the concept of a “demand function” would not be a
useful one. Professor Wold is therefore rightly concerned to show from
empirical evidence that demand functions, which are the cornerstone of
consumption theory, do possess such stability. To what extent he has in
fact shown this is a question to which we shall have to return.


* A review article of Demand Analysis: A Study in Econometrics, by
Herman Wold in association with Lars Jureén. New York: John Wiley and
Sons; Stockholm: Almqvist and Wiksell, 1953. Pp. xvi, 358. $7.00.

The formulation of demand functions is not the ultimate aim of what
the author describes as the “Paretoan” theory of consumer demand,
though it was as far as the “demand function approach” (G. Cassel [3])
was prepared to go. In order to state theorems about these functions
additional assumptions have to be made. These assumptions, which have
been expressed in different ways, link the demand functions for an
individual consumer with his preferences for various collections of
goods. The now classical approach, due to Pareto and also favored by
Wold, attributes to the consumer a consistent preference ordering for
all such collections. A more recent version, advanced by Allen [1],
assumes that the consumer only compares collections that are very
close to each other; this, however, is not a genuine generalization,
for as soon as there is a finite difference between compared
collections a chain of comparisons can be made and we are back to the
preference ordering approach. A third approach, originally proposed by
Samuelson [10], expresses consistency of preferences directly as a
property of the demand functions; the reviewer has shown [6] that this
“revealed preference” approach, when appropriately formulated, is also
equivalent to the classical approach.

It has appeared worth while to go through these theoretical points
because Professor Wold's discussion may easily mislead the unsuspecting
reader. His theorem 4.6.1, in fact, seems to assert not only the
equivalence of the marginal substitution¹ and preference ordering
approaches, but also of the latter and Cassel's demand function
approach. In other words, the author does not regard the assumption
that demand functions are derived from consistent preferences as an
additional one; in his view these functions, lest they be
“self-contradictory,” must always satisfy the so-called “integrability
condition,” which expresses this consistency. Similarly, he declares
the revealed preference approach to be merely a variation of the demand
function approach. In doing so he misinterprets both, for it has been
shown [6] that the strong axiom of revealed preference, which is
certainly not satisfied by an arbitrary set of demand functions, is a
necessary and sufficient condition for the existence of a consistent
preference ordering. Using words in their usual meanings there is
nothing “self-contradictory” in a set of demand functions for which the
integrability condition does not hold; all one can say is that the
notion of preference does not apply to such a set. Wold's theorem 4.6.1
is therefore based on a petitio principii. If it holds true for the
marginal substitution approach, this is only for the reasons given in
the previous paragraph, and not because of Wold's circular argument.


¹ Wold ascribes this “marginal substitution” approach also to Hicks,
but this is very questionable. As early as the well-known pair of
papers by Hicks and Allen of 1934 [5] a discrepancy between the views
of the two authors was evident.

In any review more space is inevitably devoted to criticism than to
commendation, and we add at once that apart from this slip² the
author's exposition of the preference ordering approach in Chapter 4 is
lucid and original. Chapters 5, 6, and 7, dealing with the specification
of demand patterns, relations between demand elasticities and market
demand, lift consumption theory above the formal level on which it is
too often discussed. Still more stress might have been laid, however, on
the preponderance of corner equilibria and the resulting restrictions on
the validity of traditional calculus methods. Curiously enough the
author has failed to see that Hicks' method of deriving market demand
is exactly the same as his own, so that his criticisms are unfounded.
The exercise 2.27, which is claimed to illustrate Wold's objection, is
highly instructive nevertheless. In Chapter 8 applications of preference
theory to the supply of labor, to barter and to price index numbers are
discussed.

Although Professor Wold points out and evaluates some of the
limitations to Paretoan demand theory resulting from its static,
non-stochastic and individualistic character, there is very little
discussion of a possibly more serious complication, viz. that arising
from consumers' assets. These assets, particularly in the form of
durable consumption goods, lead to indivisibility problems and to the
explicit introduction of time into the budget (see the recent work of
Theil [13] and also Boulding [2]). Formally these problems may be
covered by an extension of Wold's axioms, but in practice this is not
very helpful; in fact the most difficult and interesting questions of
theoretical and empirical demand research are precisely in this area.
In a book entitled Demand Analysis readers should at least have been
made aware of this field and referred, for instance, to the
investigations of De Wolff [4] and Roos and Von Szeliski [9] on
automobile demand. The preoccupation of the empirical chapters with
food demand, though otherwise understandable, may also leave students
with exaggerated notions as to the scope of a purely static approach.


² It might have been avoided if the author had paid more attention to
Samuelson's questioning [11] of a similar theorem in [15]. As will be
seen from the above Wold is also incorrect in describing the result in
[6] as mathematically equivalent to his own assertion on the demand
function approach.


STATISTICAL METHODS

The statistical controversies in demand analysis are perhaps of more
importance than the theoretical questions reviewed above, since the
former have a greater bearing on the numerical results obtained. Early
in the history of econometrics it was recognized that a statistical
approach designed mainly for biological experiments may not be entirely
suitable in the analysis of economic observations. Two problems in
particular have given rise to an extensive literature: serial
interdependence in time-series and estimation in a system of
simultaneous relations. On both of these topics, as well as on some
related ones, Professor Wold has much of interest to contribute.

His discussion of time-series problems is based on a condensed but
[...]nt exposition of the theory of stationary processes, including a
description of recent work by P. Whittle. He uses these results to show
that in certain important cases classical least-squares methods retain
their optimal properties in large samples, though the traditionally
calculated standard errors may no longer indicate the goodness of fit.
The special case to which the author devotes most attention is that of
a recursive system. This is a system of behavior equations which can be
solved successively rather than simultaneously. An example is the “pig
cycle” model

$$d_t = D(p_t), \qquad s_t = S(p_{t-1}), \qquad p_t = p_{t-1} + \lambda\,(d_{t-1} - s_{t-1}),$$

where $d_t$ is demand, $s_t$ supply, and $p_t$ price at time $t$.
Professor Wold maintains, or at any rate leaves his reader with the
impression, that such systems can be almost universally applied in
[...]er development do not exist in recursive systems; moreover, the
coefficients in all their equations are identified and their
least-squares estimates are asymptotically unbiased.
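
The period-by-period solution of a recursive system can be sketched in
Python. The sketch assumes a pig-cycle model of the form d_t = D(p_t),
s_t = S(p_{t-1}), p_t = p_{t-1} + lam * (d_{t-1} - s_{t-1}); the
one-period supply lag, the linear demand and supply curves, the
adjustment coefficient and the starting price are all hypothetical
choices made for illustration, not values taken from Wold's book.

```python
def simulate_pig_cycle(demand, supply, p0, lam, steps):
    """Solve the recursive system one period at a time.

    Returns a list of (price, demand, supply) triples for t = 0, ..., steps - 1.
    """
    p_prev = p0  # assumption: the price before the first period equals p0
    p = p0
    path = []
    for _ in range(steps):
        d = demand(p)        # d_t = D(p_t)
        s = supply(p_prev)   # s_t = S(p_{t-1}): supply reacts with a one-period lag
        path.append((p, d, s))
        # p_{t+1} = p_t + lam * (d_t - s_t): excess demand raises the next price
        p_prev, p = p, p + lam * (d - s)
    return path


# Hypothetical linear curves; they cross at p = 6, where demand = supply = 4.
path = simulate_pig_cycle(
    demand=lambda p: 10.0 - p,
    supply=lambda p: 1.0 + 0.5 * p,
    p0=4.0,
    lam=0.5,
    steps=80,
)
for t in (0, 1, 2, 3, 79):
    print(t, path[t])
```

Each period uses only values already determined in earlier periods, so
no simultaneous solution is needed; with these particular coefficients
the price settles toward the point where the two curves cross.
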

These are clearly desirable properties, and it is therefore necessary
to consider how wide the scope of recursive systems in fact is. Strictly
speaking they are indeed universal: there is every reason to believe that
the economy, to use an anthropomorphic simile, solves its simultaneous
equations by trial and error. Individuals adjust their actions to




parameters which they regard as fixed, but which are subsequently
themselves affected by these actions so that new adjustments are
required. Static models in which equilibrium values determine each other
without sequence or lags are an abstraction, and have been recognized
as such for almost as long as they have been used. This does not mean
that they are useless, nor even that they are less realistic than
recursive systems, in which such lags occur explicitly.

The problem here is that these lags are of very different length,
ranging from a few seconds for the response of share or commodity prices
to shifts in excess demand all the way to several years for the
production of new ships or roads. In recursive systems of the kind
described by Wold, however, lags have to be integral multiples of some
unit period, usually the period to which the observations refer. If, as
is usual in econometrics, annual observations have to be used, a
one-year or two-year lag can easily be fitted in, but the question of
what exactly should be done with lags of other lengths has not received
Professor Wold's attention. In the absence of such a discussion his
stress on recursive systems does not carry complete conviction.

We may perhaps detect here, as in other places, the results of an
insufficient analysis of the use of approximative theoretical models in
empirical research. The logic of regression analysis is treated with its
application to experimental data as a prototype; with the aid of this
interpretation rules for the selection of dependent variables are given.
There is much play with the notion of causality, even though it has long
lost its former pre-eminence in the physical and biological sciences and
has never been very popular with economists, who tend to think in
terms of functional rather than causal relationships.³ Its introduction
in any case hardly helps to bring out the difficulties peculiar to
inference from non-experimental observations, arising mainly from the
fact that the latter usually have to be taken as they come and are
available in limited number only. Models therefore have to be chosen
with reference to the data with which they are to be used. The resulting
problem of how to choose between models is nowhere faced squarely in
Wold's work, though there is a somewhat inconclusive discussion of the
effects of additional regressors on the estimates of parameters already
taken into account.

³ Recently an attempt to rehabilitate the notion of causality has been
made by Simon [8], whose point of view is very similar to that of Wold.

What is lacking, to put it in other words, is an adequate treatment
of small-sample estimation. This complaint is of course not addressed
to Professor Wold alone, for after Student's and Fisher's classical
contributions progress in this important area has been disappointing.
Contemporary interest in estimation problems seems to be mainly centered
on estimators with various asymptotic properties whose practical
usefulness is often hard to see. Under these circumstances the author is
unfortunately right in stating that for small samples we have to be
satisfied with “the rough inference drawn by the use of large-sample
methods,” but this should not imply an abandonment of the search for
more appropriate procedures.

These comments are mostly elicited by Wold's stimulating Chapter
2, in which he attempts to show that least-squares regression, despite
the objections from Oslo and Chicago, is “essentially sound.” The
words in quotes are in fact characteristic of his attitude of militant
conservatism on many points of controversy in statistics and elsewhere.
Not all readers will find their doubts quieted by the forceful but
occasionally one-sided array of arguments, but they will learn a great
deal from trying to refute them.



EMPIRICAL FINDINGS 

The empirical part, for which Mr. Jureén was jointly responsible,
describes an extensive investigation of food demand in Sweden
undertaken in connection with an inquiry into the long-term position of
Swedish agriculture. Both family budgets and market statistics are
used as source material. In line with Professor Wold's views on
statistical methods as discussed above only a few technological
innovations are to be noted. The numerical results, subject to the
validity of the techniques employed, are on the whole very reasonable
and their discussion, again with this proviso, is competent and
illuminating. Our qualification refers to the authors' neglect of the
supply side when dealing with demand equations, but this neglect is of
course deliberate.


The work on family budgets (Chapter 16) is based on three Swedish
surveys dating from 1913, 1923, and 1933. It is shown that estimates of
[...] food consumption obtained by blowing up averages from the [...]
survey agree fairly closely with independent estimates of market
[...]. This is the more remarkable because the sample was not random
but voluntary and participants had to keep detailed accounts for a
whole year. It is in fact by no means clear that the voluntary approach
is really inferior to the random sampling methods (with inevitably low
response rates) currently in vogue.

From these budgets income elasticities are estimated for a
considerable number of commodities and family types, quantity and
expenditure elasticities being distinguished. In most of the analyses
constant-elasticity formulas are applied, but since the results reveal
that the elasticities are not independent of income, some use is also
made of the group of formulas suggested by Törnqvist. No standard errors
of the estimates are calculated, on the ground that they are not
theoretically justified for this material.

Earlier in the book (Chapter 14) there is a short discussion of
equivalent adult scales where a method for determining the weights is
proposed. This consists in calculating income elasticities in two ways:
by pooling the separate estimates for different family types and by
deriving a joint estimate for all households after their expenditures
have been divided by the relevant number of equivalent adults. If the
scale is correct, the two calculations will yield the same result. This
agreement, however, is in general only a necessary and not a sufficient
condition for correctness. Except in the case where only two kinds of
persons (children and adults, for instance) are taken into account no
unique scale will be obtained by Wold's method, and in practice it is
desirable to specify many more categories of persons. Moreover,
although Wold recognizes that the scale for total expenditures should
be different from the scales for particular items this point does not
seem to be allowed for in his method of computation.

Professor Wold's views on the relation between the income elasticities
estimated from family budgets and those estimated from time series
are also worth noting, especially since he was (in [14]) the first to
combine the two sources in the manner now widely adopted. He
distinguishes between short term and long term elasticities and
maintains that the two sources both estimate the latter variety, which
is usually the more interesting one. Because of the continuous
introduction of new commodities, however, he thinks that the
elasticities obtained from budget data on the whole tend to be smaller
than those that refer to market statistics. This is an interesting
observation, but it is not so clear that the time-series elasticity is
really a long term figure and this somewhat weakens the author's
conjecture.

In their work on market statistics (Chapter 17) Messrs. Jureén and
Wold frequently use “conditional” regression analysis with income
elasticities inserted as if they were known from other sources; they do
not always use the estimates obtained from budget data but supplement
these by “common sense” arguments. Sometimes they fix the price
elasticity instead of the income elasticity. Standard errors of the
estimates are calculated by means of a new formula which allows for
autocorrelated disturbances. The symmetry of cross-price elasticities
according to Slutsky and Hotelling is tested; it is found to hold
rather strikingly in the case of pork and beef, but much less so for
animal and vegetable foodstuffs.

Some more general points in market demand are discussed in Chapter
15. Wold there pronounces himself against trend removal but in
favor of deflating prices and incomes by a general price index. With the
first advice the reviewer agrees, and, at any rate in the case of
foodstuffs, also with the second. The author's argument on the latter
question, however, is very superficial. He recommends deflation to
correct for “changes in the monetary unit,” and refers to Schultz [12]
in support; Schultz, on the contrary, advocated deflation because it
increases the degrees of freedom by one even though there is no exact
method of taking changes in other prices into account. This is a much
sounder argument, which shows incidentally that deflation is not
invariably appropriate whereas Wold implies that it should always be
applied.

In the final Chapter 18 the most interesting contribution is a detailed
forecast of 1949-50 food consumption on the basis of pre-war demand
functions tested against actual consumption in the forecast period. It
is shown that the forecasts are on the whole reasonably accurate, and
that in most cases they are nearer to the observations than the pre-war
average on which a “naive” forecast might be based. It would be
captious to deny that this satisfactory result speaks well for the
validity of the methods used, whatever doubts one may have about their
theoretical justification.



CONCLUSION 


According to the preface Demand Analysis “is written in the dual
form of a research report and a specialized textbook of econometrics.”
There are advantages and dangers in such a combination, particularly
for the textbook half, and both are conspicuous here. The main
advantage is that the methods discussed can be illustrated by actual
applications, though this is perhaps more effective if these
applications belong to several fields instead of to a single rather
narrow one as is the case here. Moreover readers will look in vain for
actual applications of most of the economic theory and much of the
statistics discussed in the earlier parts of the book. The dangers of
the dual form are even more apparent. It has been pointed out already
that the preoccupation with food demand in the empirical sections has
led to an unfortunate neglect of dynamic factors. If industrial
commodities had been studied as well as agricultural ones, there might
also have been less inclination to ignore the complications due to
simultaneous equations.

As a textbook Demand Analysis therefore has serious limitations,
which means that its use as such requires a considerable amount of
additional explanation and amendment. The arrangement of the material
could also be improved and much repetition eliminated. Professor
Wold could hardly be blamed for not writing a standard work, since
the subject is still too young to admit of one, but he would have come
closer to writing one if he had been as successful in interpreting the
ideas of others as he is in expounding his own. What we have here is
essentially an admirable statement of the opinions and methods favored
by one expert for his own research interests, and as such the book is
an occasion for unqualified gratitude.


REFERENCES


[1] Allen, R. G. D., Mathematical Analysis for Economists. London, 1938.
[2] Boulding, K. E., A Reconstruction of Economics. New York, 1950.
[3] Cassel, G., Theoretische Sozialökonomie. Leipzig, 1918.
[4] De Wolff, P., “The demand for passenger cars in the U. S.,” Econometrica,
    Vol. 6 (1938), 113-29.
[5] Hicks, J. R., and Allen, R. G. D., “A reconsideration of the theory of
    value,” Economica, N.S. 1 (1934), 52-75, 196-219.
[6] Houthakker, H. S., “Revealed preference and the utility function,”
    Economica, N.S. 17 (1950), 159-74.
[7] Koopmans, T. C. (ed.), Statistical Inference in Dynamic Economic Models.
    New York, 1950.
[8] Koopmans, T. C., and Hood, W. C., Studies in Econometric Method. New
    York, 1953.
[9] Roos, C. F., and Von Szeliski, V., The Dynamics of Automobile Demand.
    New York, 1939.
[10] Samuelson, P. A., “Note on the pure theory of consumer's behavior,”
    Economica, N.S. 5 (1938), 61-71.
[11] Samuelson, P. A., “The integrability problem in utility theory,”
    Economica, N.S. 17 (1950), 355-85.
[12] Schultz, H., The Theory and Measurement of Demand. Chicago, 1938.
[13] Theil, H., The Influence of Stocks on Consumers' Demand (in Dutch).
    Amsterdam, 1951.
[14] Wold, H., The Demand for Agricultural Products and Its Sensitivity to
    Price and Income Changes (in Swedish). Stockholm, 1940.
[15] Wold, H., “A synthesis of pure demand analysis,” Skandinavisk
    Aktuarietidskrift, 26 (1943), 85-118, 220-63 and 27 (1944), 69-120.



SOME PRACTICAL TECHNIQUES IN SERIAL
NUMBER ANALYSIS


LEO A. GOODMAN
University of Chicago


The problem discussed is that of sampling from continuous
and discrete uniform distributions. An application of this
problem is presented which deals with the analysis of serial
numbers on manufactured items in order to estimate the total
number of items manufactured. Estimates of bounded relative
error are obtained. Some justification for the use of these
estimates is presented from the loss (cost) function point of
view. Confidence intervals for the parameters are obtained
and graphs are presented which may be used to determine the
sample size required for confidence intervals of a given
expected relative length. Tests of hypotheses are discussed. A
method is presented for determining whether the serial numbers
obtained are a random sample from a population of consecutive
serial numbers.


CONTENTS

1. INTRODUCTION
2. SUMMARY
3. CONTINUOUS VARIATION
   3.1. Initial Number Known
      3.1.1. Confidence intervals
      3.1.2. Testing hypotheses
      3.1.3. Estimates of bounded relative error
      3.1.4. Tests of randomness and consecutive serial numbering
   3.2. Initial Number Unknown
      3.2.1. Confidence intervals
      3.2.2. Testing hypotheses
      3.2.3. Estimates of bounded relative error
      3.2.4. Tests of randomness and consecutive serial numbering
4. THE EXACT MODEL
5. AN APPLICATION
6. REFERENCES

CHART

1. Sample Cumulative Distribution of the 29 Observed Serial Numbers,
   between the Smallest and Largest

The author is indebted to Mr. William A. Aronow, New Holland Machine
Company, and to Mr. Harry V. Roberts, University of Chicago, for very
helpful comments.


1. INTRODUCTION 


The analysis of serial numbers has several practical applications.
We shall describe two such uses. The interested reader will
no doubt think of still other applications.

a) A commercial company could use the methods of serial number
analysis in order to estimate the production and capacity of its
competitors. Representatives from the company could obtain the serial
numbers of showroom equipment as well as equipment in use which has
been produced by the competitors. Many of the basic methods have
been developed for analyzing the serial numbers obtained by the
company representatives (see [3]).

b) An organization has been using equipment which was purchased
many years ago. The question was raised as to how many pieces of
equipment had been purchased. No records were immediately available
to determine the total purchase, since the purchase had been made
years ago. Since serial numbers had been placed on each piece of
equipment at the time of purchase, the serial numbers obtained from a
sample of the equipment could be used to estimate the total purchase.
Section 5 describes how this method was used to estimate the total
number of pieces of equipment (desks, bookcases, etc.) which were
purchased for the Division of the Social Sciences, The University of
Chicago.


2. SUMMARY


Some of the practical problems which are of importance to organiza- 
tions using “serial numbet analysis” will be considered here. 

The arithmetic involved in the analysis of serial numbers seems to
be simpler if the unknown production p is “assumed so large that
variation is continuous” (see [3], p. 629). Some results for the
“continuous variation” case will be presented which will serve as an
approximation to the exact results. Some exact results will then be
discussed.

The problem of obtaining confidence intervals for the total
production p is studied. The sample size necessary to obtain confidence
intervals of a given average relative length is determined. The power of
tests of hypotheses concerning the true value of the production is also
examined.

Rather than use an estimate of the production p which is unbiased
or which minimizes the average of the squared error (see [3]) it might
be desirable to have an estimate of which we are “almost certain” that
it will be no more than, say, 1.2 times p and no less than, say, 0.8p.
The estimate which maximizes the probability of being included in the
desired interval may be determined. For example, if d is the
difference between the largest and smallest serial number in a sample
of [...] serial numbers, then we can be “99.99% confident” that the
estimate 1.20d will be between 0.8p and 1.2p. In other words, we can be
“99.99% confident” that the relative error of the estimate 1.20d of p
is less than .2. Justification of the use of such estimates of “bounded
relative error” is presented within the framework of the theory of
statistical decisions.
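
A statement of this kind can be checked numerically in the continuous
case with a short Python sketch. It relies on the standard result that
the range of n independent observations from a uniform distribution,
expressed as a fraction r of the width of the distribution, has
cumulative distribution n r^(n-1) - (n-1) r^n. The function names, the
default multiplier 1.2 and the 20 per cent tolerance are illustrative
assumptions, and the sample sizes tried are not taken from the text.

```python
def range_cdf(r, n):
    """P(R <= r) for the range R of n independent uniform(0, 1) observations."""
    if r <= 0.0:
        return 0.0
    if r >= 1.0:
        return 1.0
    return n * r ** (n - 1) - (n - 1) * r ** n


def prob_bounded_relative_error(n, multiplier=1.2, tolerance=0.2):
    """Probability that multiplier * d lies within (1 +/- tolerance) * p,
    where d is the sample range and p the width of the uniform distribution."""
    lo = (1.0 - tolerance) / multiplier   # smallest acceptable value of d / p
    hi = (1.0 + tolerance) / multiplier   # largest acceptable value of d / p
    return range_cdf(hi, n) - range_cdf(lo, n)


for n in (10, 20, 30):
    print(n, prob_bounded_relative_error(n))
```

Under these assumptions the probability rises quickly with the sample
size, which is why a very high confidence level becomes attainable with
a few dozen serial numbers.
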

A method is also presented for testing the basic assumptions made
in serial number analysis by examining the serial numbers which have
been obtained. It is possible to test the hypothesis that the serial
numbers obtained are a random sample. This method may also be used to
test whether there is a change in the procedure of serial numbering.
An application of the methods described herein is discussed in
Section 5.


3. CONTINUOUS VARIATION 


In this section we shall assume that the serial numbers have a
continuous uniform distribution between the initial serial number $s$
and final serial number $s+p$, where the total production $p$ is
unknown. Both the case when the initial serial number $s$ is known and
also when it is unknown will be considered.


3.1. Initial Number Known


When the initial number $s$ is known, we might subtract $s$ from each
serial number obtained. The serial numbers (after the subtraction has
been made) will then be uniformly distributed between 0 and $p$. The
production $p$ will be estimated using a sample of $n$ serial numbers.

3.1.1. Confidence intervals. Let us first consider the problem of
obtaining confidence intervals for $p$. If $g$ is the largest serial
number obtained, suppose we state that “the total production $p$ is
between $g$ and $ag$,” where $a$ is some constant greater than 1. Then
the probability that this statement will be incorrect is $1/a^n$. That
is, such a statement will be incorrect if and only if $ag < p$ (that
is, $g < p/a$). If $n = 1$, the probability that $g < p/a$ is
$\int_0^{p/a} dr/p = 1/a$. Since each observation is independent, the
probability that all observations, and therefore $g$ in particular,
will be less than $p/a$ is $1/a^n$. This probability $1/a^n = \alpha$
of making an incorrect statement may be made small by choosing a large
value for the constant $a$, or by obtaining a large sample of $n$
serial numbers. We might first determine how small the probability
$\alpha$ of making an incorrect statement should be, and then determine
$a$ or $n$ from the relation $\alpha = 1/a^n$. The interval “$g$ to
$ag$” in which it is stated that $p$ lies is called the
“$(1-\alpha)\cdot 100\%$ confidence interval” since the probability is
$1-\alpha$ that the statement will be correct.
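
The relation between the error probability and the constant $a$ can be
illustrated with a minimal Python sketch. The function names, the small
example sample and the simulation settings are hypothetical; the
simulation simply draws samples from a continuous uniform distribution
on (0, p) and counts how often the stated interval contains p.

```python
import random


def upper_factor(alpha, n):
    """Constant a satisfying alpha = 1 / a**n."""
    return alpha ** (-1.0 / n)


def confidence_interval(serials, alpha, s=0):
    """Interval (g, a*g) for p, given serial numbers and a known initial number s."""
    g = max(x - s for x in serials)          # largest serial number after subtracting s
    a = upper_factor(alpha, len(serials))
    return g, a * g


def simulated_coverage(p, n, alpha, trials=20000, seed=1):
    """Fraction of simulated samples whose interval contains the true p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.uniform(0.0, p) for _ in range(n)]
        lo, hi = confidence_interval(sample, alpha)
        hits += lo <= p <= hi
    return hits / trials


print(confidence_interval([12, 45, 7, 88, 63], alpha=0.05))
print(simulated_coverage(p=1000.0, n=10, alpha=0.05))
```

The simulated coverage should come out close to $1-\alpha$, in line
with the argument above.
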

The length of the confidence interval in which it is stated that $p$
lies is $ag - g = g(a-1)$. Since the expected value of $g$ is
$pn/(n+1)$, the expected length of the interval is $pn(a-1)/(n+1)$. The
expected relative length of the interval is $n(a-1)/(n+1) = \lambda$.
We might first determine how small the probability $\alpha$ of making
an incorrect statement should be and also how small the expected
relative length $\lambda$ of the confidence interval should be. The
sample size $n$ of the serial numbers may then be determined by the
relations

$$n(a-1)/(n+1) = \lambda \quad\text{and}\quad \alpha = 1/a^n, \quad\text{or}$$
$$a = \lambda + 1 + \lambda x \quad\text{and}\quad a = \alpha^{-x}, \quad\text{where } x = 1/n.$$


For any given values of α and λ, graphs of the functions λ+1+λx and α^(−x) can be drawn. The value x₀ of x where the two graphs intersect is then the desired solution of the last two equations. The reciprocal 1/x₀ = n₀ of this solution is the desired sample size. If then n₀ serial numbers are obtained, we will have (1−α)·100% confidence in the statement that “p lies between g and ag.” The expected relative length of this confidence interval is the desired value λ.
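The graphical solution can also be carried out numerically. The following is a minimal sketch, not part of the original paper: it finds the intersection of λ+1+λx and α^(−x) by bisection. The function name `sample_size_known_start` and the illustrative values α = .05, λ = .10 are assumptions made for this example.

```python
def sample_size_known_start(alpha: float, lam: float, tol: float = 1e-12) -> float:
    """Solve lam + 1 + lam*x = alpha**(-x) for x > 0 and return n0 = 1/x0.

    Assumes 0 < alpha < 1 and lam > 0 (hypothetical helper, not from the paper).
    """
    def gap(x: float) -> float:
        # Positive when the exponential curve lies above the straight line.
        return alpha ** (-x) - (lam + 1.0 + lam * x)

    lo, hi = 0.0, 1.0
    # gap(0) = -lam < 0; widen hi until the exponential overtakes the line.
    while gap(hi) <= 0.0:
        hi *= 2.0
    # gap is convex, so there is exactly one positive root between lo and hi.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 1.0 / hi


n0 = sample_size_known_start(alpha=0.05, lam=0.10)
a = 0.05 ** (-1.0 / n0)  # the multiplier a implied by alpha = 1/a**n
print(n0, a)
```

In practice n₀ would be rounded up to the next whole number of serial numbers.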

It is interesting to note that among all (1−α)·100% confidence intervals of the form “a₁g to a₂g,” where 1 ≤ a₁ < a₂, the confidence interval with the smallest average length is obtained by taking a₁ = 1, which is what we have done.
3.1.2. Testing hypotheses. Let us now consider the problem of testing the hypothesis that the total production is a given value p₀. This hypothesis will be rejected when the given value p₀ does not lie within the confidence interval. In other words, having observed a sample of serial numbers, we make a confidence statement that “p is between g and ag,” and reject the “null” hypothesis that the total production is a given value p₀ if this value lies outside the confidence interval. The probability is α = 1/a^n of rejecting this hypothesis when it is in fact true. We should like the probability of rejecting the null hypothesis (that the total production is p₀) to be large, when the hypothesis is in fact false (i.e., when the total production is a value p different from p₀). This probability 1−β of correctly rejecting the null hypothesis, when in fact the true production is p, may be determined by the following formula:


SERIAL NUMBER ANALYSIS 101 
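$$
1 - \beta(p) =
\begin{cases}
1, & \text{when } p < p_0/a, \\
\alpha\,(p_0/p)^{n}, & \text{when } p_0/a \le p \le p_0, \\
1 - (1 - \alpha)(p_0/p)^{n}, & \text{when } p > p_0.
\end{cases}
$$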


We call 1−β(p) the power function of the test.

The formula for the power function 1−β(p) follows directly from the following considerations. The null hypothesis that the total production is a given value p₀ will be rejected whenever p₀ < g or p₀ > ag. But g < p₀/a if and only if all observations are less than p₀/a. The probability that an observation will be less than p₀/a is p₀/ap, when in fact the true production is p > p₀/a. Hence the probability that all observations will be less than p₀/a (i.e., g < p₀/a) is (p₀/ap)^n = (p₀/p)^n α if p > p₀/a. If p < p₀/a, rejection of the null hypothesis is certain since g ≤ p < p₀/a. The probability that at least one observation will be greater than p₀ (i.e., g > p₀) is zero for p ≤ p₀, and it is 1 − (p₀/p)^n for p > p₀. From these conclusions the formula for the power function follows directly.

We might first determine how small the probability α of incorrectly rejecting the null hypothesis should be and also how large the probability 1−β should be of correctly rejecting the null hypothesis when a particular alternative hypothesis p = p₁ (different from p₀) is true. If the alternative hypothesis p = p₁ has been specified, the appropriate sample size of the serial numbers required can be determined by solving the equation

$$
1 - \beta = 1 - \beta(p_1)
$$

for the value of n. For example, if p₀/a ≤ p₁ < p₀, then

$$
1 - \beta = \alpha\,(p_0/p_1)^{n}
$$

or

$$
(1 - \beta)/\alpha = (p_0/p_1)^{n}
$$

or

$$
n = \log\,[(1 - \beta)/\alpha]\,/\,\log\,[p_0/p_1].
$$
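The power function and the sample-size formula above can be checked numerically. This is a minimal sketch under stated assumptions: the function names and the illustrative values (p₀ = 1100, p₁ = 1000, α = .05, desired power .90) are hypothetical and do not come from the paper.

```python
import math


def power(p: float, p0: float, n: int, alpha: float) -> float:
    """Power 1 - beta(p) of the test of p = p0 based on the largest serial number g."""
    a = alpha ** (-1.0 / n)  # the multiplier a implied by alpha = 1/a**n
    if p < p0 / a:
        return 1.0
    if p <= p0:
        return alpha * (p0 / p) ** n
    return 1.0 - (1.0 - alpha) * (p0 / p) ** n


def sample_size_for_power(alpha: float, target_power: float, p0: float, p1: float) -> float:
    """n = log[(1 - beta)/alpha] / log[p0/p1]; valid when p0/a <= p1 < p0."""
    return math.log(target_power / alpha) / math.log(p0 / p1)


n_exact = sample_size_for_power(0.05, 0.90, p0=1100.0, p1=1000.0)
n = math.ceil(n_exact)  # round up to a whole number of serial numbers
print(n_exact, n, power(1000.0, 1100.0, n, 0.05))
```

Because a depends on n, the condition p₀/a ≤ p₁ should be rechecked once n has been chosen.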


3.1.3. Estimates of bounded relative error. In [3], the problem of point estimation of p was considered and the unbiased estimate of p which had the smallest variance was given. The relation between this unbiased estimate and various other point estimates of p was examined. The problem of point estimation will now be considered from a somewhat different point of view. We might want to be “almost certain” that the estimate of production p obtained from the sample of n serial numbers will not be more than 1.2 times as large as the true production p, and will not be smaller than 0.8p. If the estimate is of the form cg, where c ≥ 1 is a constant and g is the largest among the n serial numbers, then the probability that the estimate cg will lie between 0.8p and 1.2p is

$$
(1.2/c)^{n} - (0.8/c)^{n}, \quad \text{when } c \ge 1.2,
$$

and

$$
1 - (0.8/c)^{n}, \quad \text{when } c \le 1.2.
$$

Hence the probability that cg will lie between 0.8p and 1.2p is maximized when c = 1.2 and, in that case, the probability is

$$
1 - (0.8/1.2)^{n}.
$$

The sample size n necessary in order that we can be “(1−α)·100% confident” that 1.2g lies between 0.8p and 1.2p is determined by the relation

$$
1 - \alpha = 1 - (0.8/1.2)^{n}
$$

or

$$
n = \log \alpha / \log\,(0.8/1.2).
$$
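As an illustration of the two formulas above, the sketch below evaluates the probability that cg falls between 0.8p and 1.2p and the sample size for a given α. The function names, the grid of candidate multipliers, and the choice α = .05 are assumptions made for this example only.

```python
import math


def prob_within(c: float, n: int, k1: float = 0.8, k2: float = 1.2) -> float:
    """P(k1*p <= c*g <= k2*p) when g/p has distribution function t**n on [0, 1]."""
    upper = min(1.0, k2 / c)
    lower = min(1.0, k1 / c)
    return upper ** n - lower ** n


def sample_size(alpha: float, k1: float = 0.8, k2: float = 1.2) -> int:
    """Smallest whole n with n >= log(alpha) / log(k1/k2)."""
    return math.ceil(math.log(alpha) / math.log(k1 / k2))


n = sample_size(0.05)
# Scan multipliers c = 1.00, 1.01, ..., 2.00 to see where the probability peaks.
grid = [1.0 + 0.01 * i for i in range(0, 101)]
best_c = max(grid, key=lambda c: prob_within(c, n))
print(n, best_c, prob_within(1.2, n))
```

The scan is only a check on the statement that c = 1.2 maximizes the probability; it is not needed to apply the formulas.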


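It may be desirable to determine an interval c₁g to c₂g (1 ≤ c₁ ≤ c₂) of which we can be at least “(1−α)·100% confident” that any given estimate of the form g times a given constant in the interval c₁g to c₂g will lie between 0.8p and 1.2p. The sample size n must be greater than log α/log (0.8/1.2). In that case, the values of c₁ and c₂ are determined by

$$
1 - \alpha = 1 - (0.8/c_1)^{n}
$$

and

$$
1 - \alpha = [1.2^{n} - 0.8^{n}]/c_2^{\,n}, \quad \text{since } c_1 \le 1.2 \le c_2.
$$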


We might wish to determine an interval c₃g to c₄g of which we can be “(1−α)·100% confident” that the entire interval will lie between 0.8p and 1.2p. If n > log α/log (0.8/1.2), appropriate values of c₃ < 1.2 and c₄ > 1.2 can be determined by the relation

$$
(1.2/c_4)^{n} - (0.8/c_3)^{n} = 1 - \alpha.
$$


More generally, if an estimate cg is desired which maximizes the probability of being included between k₁p and k₂p (where the k's are given constants such that k₁ < k₂) then the estimate should be k₂g. If the sample size n is greater than log α/log (k₁/k₂), then the probability is at least 1−α that any given estimate of the form g times a given constant in the interval c₁g to c₂g will lie between k₁p and k₂p, where

$$
c_1^{\,n} = k_1^{\,n}/\alpha
$$

and

$$
c_2^{\,n} = [k_2^{\,n} - k_1^{\,n}]/(1 - \alpha).
$$


Also, the probability is 1−α that the entire interval c₃g to c₄g will lie between k₁p and k₂p where

$$
(k_2/c_4)^{n} - (k_1/c_3)^{n} = 1 - \alpha.
$$


In practice it may sometimes be possible to determine the constants k₁ and k₂ so that if the estimate p̂ of p is between k₁p and k₂p it will be “close enough.” By “close enough” we mean that no loss is incurred when an estimate p̂ of p is made which is between k₁p and k₂p. When the estimate p̂ is not between k₁p and k₂p, then the loss incurred in using an estimate which is not “close enough” may be some given constant, say, 1. If the loss incurred in estimating p by p̂ may in fact be described by the function

$$
L(\hat{p}, p) =
\begin{cases}
0 & \text{when } k_1 p \le \hat{p} \le k_2 p, \\
1 & \text{otherwise,}
\end{cases}
$$

then the estimate which maximizes the chance of being included between k₁p and k₂p also minimizes the expected loss. Hence the estimate k₂g which maximizes the chance of being included between k₁p and k₂p may be justified within the framework of the theory of statistical decisions. For a more general discussion of the problem treated in this paragraph the reader is referred to [2].

3.1.4. Tests of randomness and consecutive serial numbering. It has been assumed herein that the n serial numbers obtained are a random sample from all the serial numbers which are distributed uniformly (numbered consecutively) between the initial serial number s and the final serial number s+p, where s or s+p (or both) may be unknown. Before applying the statistical methods which have been based on this assumption, it is desirable to examine the sample of n serial numbers and test whether this assumption is justified. That is, the hypothesis that the serial numbers were obtained from a random sample of n observations from a uniform distribution between s and s+p should be tested. The question “Are the serial numbers a random sample?” will be studied.

When the initial serial number s is known, it has been assumed that the serial numbers (after s has been subtracted from each serial number) are uniformly distributed between 0 and p, where p is unknown. The n serial numbers have been assumed to be a random sample of serial numbers. Let us now consider the problem of testing the hypothesis that the n serial numbers are a random sample. We note that the hypothesis to be tested is not concerned with determining the unknown true value of the production p. Several tests are available for the hypothesis that the n serial numbers are a random sample from all the serial numbers uniformly distributed between 0 and p, where p is not specified. Consider all serial numbers obtained except the largest serial number g. If the hypothesis to be tested is true, then this sample of the n−1 smallest serial numbers will be uniformly distributed between 0 and g, when g is given. Hence, dividing these n−1 serial numbers by g, the numbers obtained will be uniformly distributed between 0 and 1, when the hypothesis to be tested is true. In order to test the hypothesis of randomness, we might test whether these n−1 serial numbers (divided by g) are uniformly distributed between 0 and 1. This can be done using the Kolmogorov statistic or one of the other statistics (e.g., chi-square, maximum difference, etc.) described in [1]. For example, if n = 31, a graph of the sample cumulative distribution of the n−1 = 30 smallest serial numbers obtained (when divided by the largest serial number obtained) can be drawn. The maximum absolute difference between this sample cumulative and the cumulative of the uniform distribution (the diagonal line) is then determined. From Table 1 (N = 30), on page 428 of [1], we find that the probability is .97745 that this maximum absolute difference between the cumulatives will be less than 8/30. Hence, if a test is to be performed at the .02255 level of significance, we will accept the hypothesis of randomness whenever the maximum absolute difference between the cumulatives is less than 8/30.

If the hypothesis of randomness is accepted, the analysis described in the preceding sections herein and in [3] could then be used. If the hypothesis is rejected, the sample of serial numbers should be examined to determine what is nonrandom about it. On the basis of such an inquiry ad hoc methods for estimating the true production p could be determined.

This approach may also be used to see whether there are changes in the procedure of serial numbering. If the procedure changes (i.e., if the serial numbers are not uniformly distributed between the initial serial number and the final serial number), then a random sample of the serial numbers might indicate a nonuniform distribution. The test proposed in this section may be considered as a test of the hypothesis that serial numbering was done consecutively, as well as a “test of randomness.”


3.2. Initial Number Unknown


3.2.1. Confidence intervals. Let us first consider the problem of obtaining confidence intervals for p.

The probability that the difference d between the largest and smallest among the n serial numbers is greater than p/b (b ≥ 1) may be determined by the following relation (see [4], page 386):

$$
\begin{aligned}
\Pr\{p \ge d > p/b\} &= \Pr\{1 \ge d/p > 1/b\} \\
&= \int_{1/b}^{1} n(n-1)\,z^{\,n-2}(1 - z)\,dz \\
&= 1 - n b^{\,1-n} + (n - 1)\,b^{-n} \\
&= \Pr\{d \le p < bd\}.
\end{aligned}
$$


Suppose the statement is made that “the total production p is between d and bd,” where b is some constant. Then the probability α that this statement will be incorrect is nb^(1−n) + (1−n)b^(−n) = α. This probability α of making an incorrect statement may be made small by choosing a large value for the constant b, or by obtaining a large sample of n serial numbers. We might first determine how small the probability α of making an incorrect statement should be, and then determine b or n from the relation α = nb^(1−n) + (1−n)b^(−n). Tables are available which will simplify the computations (see [5], [6]). A reprint of [6] may be purchased from Biometrika.

Let us illustrate the methods just described by a numerical example. If α is chosen equal to 0.05, the value of 1/b can be determined from the entries in column ν₁ = 4 on p. 174 of [6] where 2(n−1) = ν₂. If n = 31 serial numbers have been obtained, then 1/b is determined by the entry in the fourth column (ν₁ = 4) and third row from the bottom (ν₂ = 60) of the table on page 174 in [6]. Hence 1/b = .85591 and b = 1.17. Upon observing 31 serial numbers, we will be 95% confident in the statement that “the total production p lies between d and 1.17d.”




The length of the 95% confidence interval for n = 31 serial numbers is d(1.17 − 1) = 0.17d. Since the expected value of d is p(n−1)/(n+1),² the expected length of the interval is (0.17)p(n−1)/(n+1) = 0.16p. The expected relative length of the interval is λ = 0.16. We might first determine how small the probability α of making an incorrect statement should be and also how small the expected relative length λ of the confidence interval should be. Then the relations

$$
(b - 1)(n - 1)/(n + 1) = \lambda \quad \text{and} \quad n b^{\,1-n} + (1 - n)\,b^{-n} = \alpha
$$

can be used to determine b and the necessary sample size n. Writing 1/b = y and 1/(n−1) = x, the first relation can be replaced by 1/y = λ+1+2λx.
Other methods for determining n may also be used; e.g., successive 
approximation procedures. 
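One such successive approximation procedure can be sketched in code. The sketch below is an illustration under stated assumptions, not the author's procedure: it solves α = nb^(1−n) + (1−n)b^(−n) for b by bisection and then increases n until the expected relative length falls to λ. All function names are hypothetical.

```python
def miss_probability(b: float, n: int) -> float:
    """alpha = n*b**(1-n) + (1-n)*b**(-n): chance that p falls outside [d, b*d]."""
    return n * b ** (1 - n) + (1 - n) * b ** (-n)


def solve_b(n: int, alpha: float, tol: float = 1e-12) -> float:
    """Find b >= 1 with miss_probability(b, n) = alpha by bisection."""
    lo, hi = 1.0, 2.0
    while miss_probability(hi, n) > alpha:  # the miss probability falls as b grows
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if miss_probability(mid, n) > alpha:
            lo = mid
        else:
            hi = mid
    return hi


def expected_relative_length(n: int, alpha: float) -> float:
    """(b - 1)(n - 1)/(n + 1) for the b that gives confidence 1 - alpha."""
    return (solve_b(n, alpha) - 1.0) * (n - 1) / (n + 1)


def smallest_n(alpha: float, lam: float) -> int:
    """First n >= 2 whose expected relative length does not exceed lam."""
    n = 2
    while expected_relative_length(n, alpha) > lam:
        n += 1
    return n


b31 = solve_b(31, 0.05)
print(b31, 1.0 / b31, expected_relative_length(31, 0.05))
```

For n = 31 and α = .05 this reproduces the tabled value 1/b = .85591 quoted in the numerical example, without recourse to the tables in [5] and [6].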
3.2.2. Testing hypotheses. The problem of testing the hypothesis that the total production is a given value p₀ may be studied in the same way as was done in Section 3.1.2. Direct computations may be made for any test at a given level α of significance in order to determine the power function of the test. The tables in [5] and [6] may be used to simplify computation.
3.2.3. Estimates of bounded relative error. Let us now consider the problem of point estimation of p from the same point of view as in Section 3.1.3. We might want to be “almost certain” that the estimate of p obtained from the sample of n serial numbers “will not be more than 1.2p nor smaller than 0.8p.” If the estimate is of the form cd, where c ≥ 1 is a constant and d is the difference between the largest and smallest among the n observed serial numbers, then the probability that the estimate will be between 0.8p and 1.2p is maximized when

$$
(1.2)^{n-1}(1 - 1.2/c) - (0.8)^{n-1}(1 - 0.8/c) = 0
$$

or when

$$
c = \frac{(1.2)^{n} - (0.8)^{n}}{(1.2)^{n-1} - (0.8)^{n-1}}. \qquad (1)
$$


The sample size n necessary in order that we can be “(1−α)·100% confident” that cd lies between 0.8p and 1.2p is determined by the relation

$$
1 - \alpha = \int_{0.8/c}^{1.2/c} n(n-1)\,z^{\,n-2}(1 - z)\,dz = \Pr\{0.8/c \le d/p \le 1.2/c\} \qquad (2)
$$

and relation (1).

² The reader will notice that the expected value of d presented on page 627 of [3] is (p+1)(n−1)/(n+1). The formula in [3] was derived for the exact model whereas the formula in this text is for the continuous variation model. Hence, d(n+1)/(n−1) − 1 is the unbiased estimate of p in the exact model (see [3]) whereas the unbiased estimate of p for the continuous variation model is d(n+1)/(n−1).
If the sample size is larger than the sample required by the preceding relations (1) and (2), two constants c₁ < c and c₂ > c can be determined, where c is defined by relation (1), such that we can be at least “(1−α)·100% confident” that any given estimate of the form d times a given constant in the interval c₁d to c₂d will lie between 0.8p and 1.2p. The values of c₁ and c₂ are determined by the relations

$$
1 - \alpha = \Pr\{0.8/c_2 \le d/p \le 1.2/c_2\}
$$

and

$$
1 - \alpha = \Pr\{0.8/c_1 \le d/p\}.
$$


It may be desirable to determine an interval c₃d to c₄d of which we can be “(1−α)·100% confident” that the entire interval will lie between 0.8p and 1.2p. When the sample size n is larger than the sample required by relations (1) and (2), appropriate values of c₃ < c and c₄ > c may be determined by the relation

$$
1 - \alpha = \Pr\{0.8/c_3 \le d/p \le 1.2/c_4\}.
$$
The numbers 0.8 and 1.2 can be replaced by k₁ and k₂ respectively in the preceding discussion to obtain more general results. A justification of estimates of bounded relative error may be presented, as was done in Section 3.1.3, within the framework of the theory of statistical decisions. The estimate cd which maximizes the chance of being included within k₁p and k₂p is also the estimate which minimizes the expected loss if no loss is incurred when the estimate is within k₁p and k₂p and a constant loss is incurred otherwise.

Let us illustrate the computations required in the preceding discussion by considering a sample of n = 31 serial numbers. The value of c as defined by relation (1) is equal to 1.20 (to three significant digits), when n = 31. Hence, the estimate 1.20d maximizes the chance of being included between 0.8p and 1.2p. From the tables on page 54 of [5] we find that the chance is .9999 that 1.20d will lie between 0.8p and 1.2p.
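The two figures just quoted can be reproduced directly from relation (1) and from the distribution of d/p in the continuous model, whose distribution function is nz^(n−1) − (n−1)z^n. The following is a minimal sketch; the function names are assumptions made for illustration.

```python
def best_multiplier(n: int, k1: float = 0.8, k2: float = 1.2) -> float:
    """Relation (1): c = (k2**n - k1**n) / (k2**(n-1) - k1**(n-1))."""
    return (k2 ** n - k1 ** n) / (k2 ** (n - 1) - k1 ** (n - 1))


def range_cdf(z: float, n: int) -> float:
    """P(d/p <= z) = n*z**(n-1) - (n-1)*z**n, with z clipped to [0, 1]."""
    z = min(max(z, 0.0), 1.0)
    return n * z ** (n - 1) - (n - 1) * z ** n


def chance_within(c: float, n: int, k1: float = 0.8, k2: float = 1.2) -> float:
    """P(k1*p <= c*d <= k2*p) in the continuous variation model."""
    return range_cdf(k2 / c, n) - range_cdf(k1 / c, n)


c = best_multiplier(31)
chance = chance_within(c, 31)
print(c, chance)
```

The same functions can be used with other values of k₁ and k₂ in place of 0.8 and 1.2.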

Suppose we wish to be 95% confident of all statements made, i.e., α = .05. The second column (p = 30) of the table (q = 2) on page 54 of [5] presents the distribution of d. Using this information together with the entry in the eighth column (ν₂ = 60) and the fourth row (ν₁ = 4) on page 175 of [6], we see that c₂ is about 1.2/(1−.011585) = 1.21 (to three significant digits). Hence if the estimate of production p based on 31 serial numbers is 1.21d, then the probability is 0.95 that this estimate will be between 0.8p and 1.2p. From the table on page 174 of [6] (ν₁ = 4, ν₂ = 60) we see that

$$
\Pr\{0.8 < d/p\} > .95.
$$

Hence, we are at least 95% confident that any given estimate (of the form d times a given constant) in the interval d to 1.21d will lie between 0.8p and 1.2p. We also find from the tables that the probability is about .95 that the entire interval d to 1.21d will lie between 0.8p and 1.2p.


3.2.4. Tests of randomness and consecutive serial numbering. Let us consider the hypothesis that the n serial numbers obtained are a random sample from the population of uniformly distributed serial numbers. In the case where the initial number is unknown, we consider all n serial numbers obtained except the largest serial number g and the smallest serial number f. If the hypothesis to be tested is true, then this sample of n−2 serial numbers (all except g and f) will be uniformly distributed between f and g, when f and g are given. Hence, subtracting f from these n−2 serial numbers and then dividing the numbers obtained by g−f, the adjusted numbers will be uniformly distributed between 0 and 1, when the hypothesis to be tested is true. In order to test the hypothesis of randomness, we might test whether these n−2 adjusted serial numbers (when f is subtracted from the serial numbers and the numbers obtained are then divided by g−f) are uniformly distributed between 0 and 1. This can be done using the Kolmogorov statistic or one of the other statistics (e.g., chi-square, maximum difference, etc.) as mentioned in Section 3.1.4. For example, if n = 31, the sample cumulative distribution of the n−2 = 29 adjusted serial numbers obtained can be graphed. The maximum absolute difference between this sample cumulative and the cumulative of the uniform distribution (the diagonal line) can then be determined. From Table 1 (N = 29) on page 428 of [1], we note that the probability is .98076 that this maximum absolute difference between the cumulatives will be less than 8/29. Hence, if a test is to be performed at the .01924 level of significance, the hypothesis of randomness and consecutive (uniformly distributed) serial numbers will be accepted whenever the maximum absolute difference between the cumulatives is less than 8/29.


4. THE EXACT MODEL


In the preceding sections we have assumed that the serial numbers have a continuous uniform distribution between the initial serial number s and the final serial number s+p. This was done in order to simplify the problem and because for practical problems (when the value of p is large) the results obtained will serve as an approximation to results for the exact model of a discrete, finite, uniform population (see [3]).

On page 624 of [3], the exact confidence intervals and tests of hypotheses are obtained for the case where the initial serial number is known. Since exact confidence intervals and tests of hypotheses were not discussed in [3] for the case where the initial serial number is unknown, we shall now consider that problem.

From [3], we see that the probability that the difference d between the largest and smallest among n serial numbers will be less than or equal to a given constant c may be determined from the relation

$$
\begin{aligned}
\Pr\{d \le c \mid n, p\} &= \sum_{d=n-1}^{c} n(n-1)(d-1)^{(n-2)}(p - d)/p^{(n)} \\
&= n\,c^{(n-1)}/(p - 1)^{(n-1)} - (n - 1)(c + 1)^{(n)}/p^{(n)},
\end{aligned}
$$

where c^(n) = c!/(c − n)!. As a first approximation to this probability we might replace the exact model by the model of a continuous uniform distribution and obtain Pr{d ≤ c | n, p} = n(c/p)^(n−1) − (n−1)(c/p)^n, for which convenient tables are available (see [5] and [6]).


Suppose we wish to test the null hypothesis that p = p₀ against the alternative hypothesis p > p₀. Then the rejection region for a significance test at level α is obviously d > c₁+1 where c₁ is the largest integer satisfying

$$
\Pr\{d \le c_1 \mid n, p_0\} < 1 - \alpha.
$$

If we wish to test the null hypothesis p = p₀ against the alternative hypothesis p < p₀, then the rejection region for a significance test at level α is d < c₂ where c₂ is the smallest integer satisfying

$$
\Pr\{d \le c_2 \mid n, p_0\} > \alpha.
$$

A two-sided test at level α of the null hypothesis p = p₀ against the two-sided alternative p ≠ p₀ is defined by the acceptance region c₂ ≤ d ≤ p₀−1. A two-sided test at the 2α level might be based on the acceptance region c₂ ≤ d ≤ c₁+1.


The results of the preceding paragraph may now be used to obtain confidence intervals. That is, the left-sided 1−α confidence interval is p ≥ k₁, where k₁ is the smallest integer satisfying

$$
\Pr\{d < d_0 \mid n, k_1\} < 1 - \alpha,
$$

and d₀ is the actual difference between the largest and smallest among the n serial numbers observed. The right-sided 1−α confidence interval is p ≤ k₂, where k₂ is the largest integer satisfying

$$
\Pr\{d \le d_0 \mid n, k_2\} > \alpha.
$$

A two-sided 1−α confidence interval is d₀+1 ≤ p ≤ k₂, and a two-sided 1−2α confidence interval is k₁ ≤ p ≤ k₂.
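The exact probabilities in this section are easy to evaluate with integer binomial coefficients, since n c^(n−1)/(p−1)^(n−1) − (n−1)(c+1)^(n)/p^(n) equals [p·C(c, n−1) − (n−1)·C(c+1, n)]/C(p, n). The sketch below is an illustration only: the function names are hypothetical, and the values n = 31 and d₀ = 2704 are borrowed from the application in Section 5.

```python
from math import comb


def exact_range_cdf(c: int, n: int, p: int) -> float:
    """Pr{d <= c | n, p} for n serial numbers drawn without replacement
    from p consecutive serial numbers (exact model)."""
    if c < n - 1:
        return 0.0
    if c >= p - 1:
        return 1.0
    return (p * comb(c, n - 1) - (n - 1) * comb(c + 1, n)) / comb(p, n)


def right_sided_limit(d0: int, n: int, alpha: float) -> int:
    """Largest integer k2 with Pr{d <= d0 | n, k2} > alpha."""
    k = d0 + 1  # p cannot be smaller than d0 + 1
    while exact_range_cdf(d0, n, k + 1) > alpha:
        k += 1
    return k


k2 = right_sided_limit(d0=2704, n=31, alpha=0.05)
print(k2)
```

The upper limit found this way can be compared with the value bd given by the continuous variation model in Section 3.2.1.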




5. AN APPLICATION

The Division of the Social Sciences of the University of Chicago has been using equipment (desks, bookcases, etc.) upon which serial numbers had been placed. The question was raised as to how many such pieces of equipment there were.

The serial numbers on thirty-one pieces of equipment were observed. The 31 serial numbers obtained were:

83, 135, 274, 380, 668, 895, 955, 964, 1113, 1174, 1210, 1344, 1387, 1414, 1610, 1668, 1680, 1756, 1865, 1874, 1880, 1936, 2005, 2006, 2065, 2157, 2220, 2224, 2396, 2543, 2787.


The serial numbers range from 83 to 2787. The sample cumulative distribution of the 29 serial numbers obtained between the smallest and largest serial numbers is graphed in Figure 5.1. The diagonal line in Figure 5.1 represents the uniform cumulative distribution between the smallest serial number 83 and the largest serial number 2787. From Figure 5.1 we see that the maximum absolute difference between the two cumulative distributions is (9.65−5)/29 = .16. If the serial numbers obtained are a random sample from a population of uniformly distributed serial numbers, then there is more than a 1−.68280 = .3172 probability of obtaining a maximum absolute difference of .16 or larger (see page 428, Table 1, N = 29, in [1]). Hence the null hypothesis that the serial numbers obtained are a random sample from a population of consecutive serial numbers is accepted.
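The maximum absolute difference read from Figure 5.1 can be recomputed from the 31 serial numbers listed above. This is a minimal sketch; the function name `ks_uniform_interior` is an assumption, and the significance probability is still to be taken from Table 1 of [1].

```python
serials = [83, 135, 274, 380, 668, 895, 955, 964, 1113, 1174, 1210, 1344,
           1387, 1414, 1610, 1668, 1680, 1756, 1865, 1874, 1880, 1936, 2005,
           2006, 2065, 2157, 2220, 2224, 2396, 2543, 2787]


def ks_uniform_interior(sample):
    """Kolmogorov distance between the uniform(0, 1) cumulative and the sample
    cumulative of the n - 2 interior values, rescaled by (x - f)/(g - f)."""
    xs = sorted(sample)
    f, g = xs[0], xs[-1]
    u = [(x - f) / (g - f) for x in xs[1:-1]]
    m = len(u)
    # The sample cumulative jumps from i/m to (i + 1)/m at the i-th ordered value.
    d_plus = max((i + 1) / m - ui for i, ui in enumerate(u))
    d_minus = max(ui - i / m for i, ui in enumerate(u))
    return max(d_plus, d_minus)


D = ks_uniform_interior(serials)
print(round(D, 2))
```

The largest gap occurs at the serial number 895, in agreement (to two decimals) with the value .16 read from the figure.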

From Section 3.2.1 we see that the unbiased estimate of the total number p of pieces of equipment is d·32/30 = (2787−83)32/30 = (2704)32/30 = 86528/30 = 2884.3 for the continuous variation model (2883.3 for the exact model). Also, the 95% confidence interval for p is “2704 ≤ p ≤ 1.17(2704)” or “2704 ≤ p ≤ 3163.7.”
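The arithmetic of this paragraph is collected below as a short sketch. The variable names are illustrative, and b = 1.17 is the rounded value quoted in Section 3.2.1 rather than a freshly computed one.

```python
n, smallest, largest = 31, 83, 2787
d = largest - smallest  # observed difference between largest and smallest

estimate_continuous = d * (n + 1) / (n - 1)   # continuous variation model
estimate_exact = d * (n + 1) / (n - 1) - 1    # exact model (see footnote 2)

b = 1.17  # rounded multiplier for n = 31, alpha = .05 (Section 3.2.1)
interval = (d, b * d)

print(round(estimate_continuous, 1), round(estimate_exact, 1), interval)
```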

From Section 3.2.3 we see that the chance is .9999 that the estimate 1.20d = 1.20(2704) = 3244.8 will be within 20 per cent of p. This estimate minimizes the expected loss if no loss is incurred when the estimate is within 20 per cent of p and a constant loss is incurred otherwise. The probability is .95 that the estimate 1.21d = 1.21(2704) = 3271.8 will be within 20 per cent of p. In fact the probability is approximately .95 that the entire interval d to 1.21d, or 2704 to 3271.8, will lie within 20 per cent of p.

FIG. 5.1. Sample Cumulative Distribution of the 29 Observed Serial Numbers Between the Smallest and Largest. (Horizontal axis: serial number, 150 to 3000; vertical axis: cumulative proportion in units of 1/29.)

It was a relatively simple task to obtain the serial numbers of 31 pieces of equipment and then to estimate p in the manner described herein. Determining the true value of p (the total number of pieces of equipment) was much more time consuming. These pieces of equipment had been purchased in the period between 1928 and 1934 and no records were immediately available to determine the total purchase. We are indebted to Mrs. Ruth Denney, Administrative Assistant to the Dean of the Social Sciences. After several days and many inquiries, Mrs. Denney was able to locate the records and found that the total number p of pieces of equipment was 2885.

6. REFERENCES

[1] Birnbaum, Z. W., “Numerical tabulation of the distribution of Kolmogorov's statistic for finite sample size,” Journal of the American Statistical Association, 47 (1952), 425-41.

[2] Blackwell, D., and M. A. Girshick, Theory of Games and Statistical Decisions, in press.

[3] Goodman, Leo A., “Serial number analysis,” Journal of the American Statistical Association, 47 (1952), 622-34.

[4] Mood, A. M., Introduction to the Theory of Statistics, McGraw-Hill Book Company, New York, 1950.

[5] Pearson, Karl, Tables of the Incomplete Beta Function, Cambridge University Press, London, 1932.

[6] Thompson, Catherine M., “Tables of percentage points of the incomplete beta function,” Biometrika, 32 (1941), 151-81.

THE PROBLEM OF AUTOCORRELATION IN REGRESSION ANALYSIS*

R. L. ANDERSON
North Carolina State College


1. INTRODUCTION 


In least squares analysis, the usual regression model is

$$
Y_t = \beta_0 + \sum_{i=1}^{p} \beta_i X_{it} + \epsilon_t, \qquad t = 1, 2, \cdots, n
$$

where the predictors, the X's, are assumed fixed in repeated sampling and the ε's independently distributed with the same variance, σ². The X's may be merely dummy variates (0 or 1), as in classification data (often called analysis of variance data). When tests of significance or confidence limits for the parameters are used, one usually assumes normality of the ε's. Even if the X's and Y follow a multivariate normal distribution, the least squares point and interval estimates of the β's can be used, and the usual null tests applied.

If the n Y's are successive observations in time, the experimenter frequently wishes to investigate the nature of the response curve over time. In this case he might set X₁ₜ = t, or he might use the method of harmonic analysis to search for periodicities in Y. In other cases, the assumed model might involve lagged values of Y as predictors. For example,

$$
Y_t = \beta_0 + \sum_{i=1}^{p} \beta_i Y_{t-i} + \epsilon_t. \qquad (2)
$$
э э 
This is an autoregressive model. Finally one could use a combined regression model with lagged Y's, present X's, lagged X's, and time as predictors. The method of least squares is applicable for autoregressive models, provided n is large [see Mann and Wald [6]].

One of the major difficulties with the use of least squares methods with time series is the strong possibility that the ε's are not independent. Aitken [1] pointed out that it is correlation of the ε's and not of the Y's which is to be avoided. It is possible that if the X's and Y's are both correlated in time, the errors will be relatively uncorrelated. A considerable amount of research has been devoted to the problem of testing for the existence of correlation in the errors, but all too little on the more important problem of the best estimation procedure when correlations do exist. Summaries of current methods of analyzing time series are given by Kendall [5] and Tintner [7].

* Paper presented at Annual Meeting of American Statistical Association, Chicago, December 27, 1952.

The correlation of successive items in a time series was called a lagged serial correlation by Yule [9]. At the present time, it is more popular to use the term serial correlation to apply to the correlation between two series and the word autocorrelation for this correlation between successive items in a given series [see, for example, Tintner [7]]. I shall use this distinction. Many of the earlier papers on this subject, however, use the Yule terminology, as can be noted from the Bibliography. If we have a set of equally spaced values, Z₁, Z₂, ···, Zₙ, selected from a population with zero mean, the autocorrelation coefficient of lag L is

cae du (2) 
o?  VSZ?2 SZ. i 
where i goes from 1 to n−L.¹ Most writers have preferred to use a definition in which the denominator is simply


S Z3. (3) 
= ihl 
DIREA > 5 

A symposium on autocorrelated time series analysis was held in 1946 under the auspices of the Royal Statistical Society. M. S. Bartlett [2] presented a general paper and Foster [4] and Cunningham and Hynd [3] presented papers on the use of autocorrelation methods in non-economic fields. J. W. Tukey at the 1951 Annual Meeting of the American Statistical Association proposed the use of the autocovariance (the numerator of r_L) in his method of spectrum analysis of time series.

2. TESTS OF SIGNIFICANCE FOR AUTOCORRELATION


Yule [10] showed that the distribution of the correlation between two autocorrelated series tends to be U-shaped with a majority of the correlations near ±1. Bartlett [15] said that if the errors were autocorrelated, one could use the usual tests of significance of regression coefficients on a preliminary basis. If these coefficients were non-significant, accept the result; if they were significant, a test was needed which took account of the autocorrelation.

¹ S will be used to indicate summation over sample values.


AUTOCORRELATION IN REGRESSION ANALYSIS 115 


One of the common methods of analyzing a single time series is harmonic analysis, in which the X's are sine and cosine terms. Fisher [18] presented a test of significance of the various amplitudes (the β's), in the restricted case of independent errors. Wilson [41] suggested that one compute successive lagged autocorrelation coefficients until the first non-significant one is reached; then use this lag (L) as an indication of the proportion of independent observations (1/L).

Three possible models are used to explain stationary trend-free time series data. Wold [8] indicated that the choice depended upon the relationship of successive true autocorrelation coefficients, ρ_L. These are usually displayed in a correlogram.

(i) Repeated non-damped cycles: use harmonic analysis.
(ii) Damped correlations but with |ρ_L| > 0: use linear autoregression.
(iii) Damped correlations, with ρ_L = 0 for L > m: use the method of moving averages,

$$
Y_t = \epsilon_t + \sum_{i=1}^{m} \gamma_i\,\epsilon_{t-i}. \qquad (4)
$$

Tintner [7] also discusses these methods in detail. Bartlett [2] cautions about the use of empirical correlograms to determine the correct model because successive sample autocorrelation coefficients tend to be highly correlated.
The approximate test ‘of amplitudes in harmonic analysis and the 
decision regarding a proper model depends upon a test of significance: 
for autocorrelation. For this reason the author decided to work on the 
distribution of тг in 1939. BecauSe of the mathematical difficulties in- 


116 AMERICAN STATISTICAL ASSOCIATION JOURNAL, MARCH 1954 | 


volved, it was decided to follow up a suggestion of Hotelling to use a 
circular definition " | 
82241 ; : | 


r = 
TL 87 ' (5) 
where ? goes from 1 to n, and Zn = Zi. > 
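The circular definition can be illustrated with a short computation. The following is a minimal sketch, not taken from the source; the function name and the sample series are hypothetical, and the input is assumed to have been corrected for the mean already when that is required.

```python
def circular_autocorrelation(z, lag=1):
    """Circular autocorrelation coefficient of lag `lag`.

    The series is treated as wrapping around, so that the value
    following z[n-1] is z[0] (hypothetical helper, for illustration).
    """
    n = len(z)
    denominator = sum(value * value for value in z)
    # The index (t + lag) % n implements the wrap-around Z_{n+t} = Z_t.
    numerator = sum(z[t] * z[(t + lag) % n] for t in range(n))
    return numerator / denominator


# A strictly alternating series is perfectly negatively correlated
# at lag 1 and perfectly positively correlated at lag 2.
series = [1.0, -1.0, 1.0, -1.0]
lag_one = circular_autocorrelation(series, lag=1)
lag_two = circular_autocorrelation(series, lag=2)
```

Because every observation appears exactly once as the leading and once as the lagged term, numerator and denominator involve the same $n$ values, which is what makes the circular form convenient for exact distribution theory.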
In 1941, the author studied the distribution for normal Z's when the population mean was zero, and, in 1942, the distribution for Z's which were deviations from the sample mean. Significance levels were computed for $r_1'$, and for several cases of lags greater than one. The theory was simplified by the fact that

$$r_L' = \frac{\sum \lambda_i m_i}{\sum m_i},$$

where the $m$'s are $\chi^2$ variables with one degree of freedom, and the $\lambda$'s are latent roots of the characteristic equation of the matrix of the coefficients in the numerator. Koopmans [22] reported on the distribution of $r_1$ as an estimate of $\rho$ in the simple autoregressive model:

$$Y_t = \rho Y_{t-1} + \epsilon_t. \qquad (6)$$

At the same time Dixon [16] was studying the moments of the distribution of $r'$ and used Beta approximations to the exact distributions to obtain significance levels. T. W. Anderson [14] later showed that no test of the hypothesis $\rho = 0$ exists which is uniformly most powerful against alternatives of the Koopmans type.

Sometime before this, a problem involving autocorrelation came up in industrial quality control, in which the mean tended to creep up and down slightly on successive observations. In order to study the variation in the production process, von Neumann, et al. [35] suggested that the statistic

$$\delta^2 = \frac{\sum_{t=1}^{n-1} (Z_{t+1} - Z_t)^2}{n - 1} \qquad (7)$$

be used to estimate $\sigma^2$. Williams [40] and von Neumann [33, 34] studied the ratio $\delta^2/s^2$, where $s^2 = SZ^2/n$ and $Z = Y - \bar{Y}$. Young [42] tabulated significance levels of a linear function of this ratio by use of an Incomplete Beta approximation. Hart [19, 20] tabulated probabilities by use of a series approximation suggested by R. H. Kent. We note that $\delta^2/s^2 = 2n(1 - r_e)/(n - 1)$, where

$$r_e = \frac{\tfrac{1}{2}(Z_1^2 + Z_n^2) + \sum_{t=1}^{n-1} Z_t Z_{t+1}}{S Z^2}. \qquad (8)$$

T. W. Anderson [14] showed that $r_e$ could be used instead of $r_1$ to test the hypothesis $\rho = 0$ for Koopmans' model. T. W. Anderson transformed Hart's significance levels [20] to significance levels of $r_e$.
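The relation between the von Neumann ratio and the end-corrected coefficient can be checked numerically. The sketch below is illustrative only; the function names and the data are hypothetical, and the second function assumes the end-corrected form in which half the squares of the first and last deviations are added to the sum of lagged products.

```python
def deviations(y):
    """Deviations of the observations from their sample mean."""
    mean = sum(y) / len(y)
    return [value - mean for value in y]


def von_neumann_ratio(y):
    """Mean square successive difference divided by s^2 = SZ^2 / n."""
    z = deviations(y)
    n = len(z)
    delta_sq = sum((z[t + 1] - z[t]) ** 2 for t in range(n - 1)) / (n - 1)
    s_sq = sum(value * value for value in z) / n
    return delta_sq / s_sq


def end_corrected_coefficient(y):
    """End-corrected autocorrelation coefficient (assumed form, see lead-in)."""
    z = deviations(y)
    n = len(z)
    numerator = 0.5 * (z[0] ** 2 + z[-1] ** 2)
    numerator += sum(z[t] * z[t + 1] for t in range(n - 1))
    return numerator / sum(value * value for value in z)


observations = [1.0, 2.0, 4.0, 3.0, 5.0]   # hypothetical data
n = len(observations)
ratio = von_neumann_ratio(observations)
from_coefficient = 2 * n * (1 - end_corrected_coefficient(observations)) / (n - 1)
# The two quantities agree up to floating-point rounding.
```

The agreement follows from expanding the squared differences: the sum of squared successive differences equals twice the sum of squares, less the two end squares, less twice the sum of lagged products.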
A non-parametric test for randomness by Wald and Wolfowitz [36] is based on the numerator of $r_1'$ not corrected for the mean. Wallis and Moore [37] developed a series of non-parametric tests based on the signs of differences. Further contributions were made by Rubin [32], Madow [26], Hsu [21], Leipnik [25], Lehmann [23] and Quenouille [29, 30, 31].

If we let $e$ be the error vector and $\boldsymbol{\sigma}$ its covariance matrix, dependent upon $\sigma^2, \rho_1, \rho_2, \cdots, \rho_{n-1}$, Lehmann and Stein [24] have shown that the best test statistic to test the hypothesis that all $\rho_i = 0$ is

$$\frac{e'\boldsymbol{\sigma}^{-1}e}{e'Ie}.$$

Whittle [39] used this method to test the null hypothesis that the data follow a first order moving average against the alternative that they follow an autoregressive scheme of first order, and vice versa.

T. W. Anderson and the author [13] derived the distribution of the circular autocorrelation coefficient for residuals from a fitted Fourier series. Significance levels were found and their use indicated. Exact distributions were possible because of the correspondence between the sine and cosine variables and the $\lambda$'s in the distribution of the numerator.

Durbin and Watson [17] derived some approximate tests of autocorrelation of the successive residuals in least squares regression with fixed X's. Let the $n$ successive least squares residuals be $Z_1, Z_2, \cdots, Z_n$. Durbin and Watson chose a modification of the von Neumann statistic,

$$d = \frac{\sum_{t=1}^{n-1} (Z_{t+1} - Z_t)^2}{S Z^2} = \frac{S(\Delta Z)^2}{S Z^2}, \qquad (9)$$

to test for the existence of autocorrelation in the errors ($\epsilon$'s). We note that $d = 2(1 - r_e)$, where $r_e$ was T. W. Anderson's statistic [14] to test the hypothesis $\rho = 0$ for Koopmans' model. It should be emphasized that the original von Neumann and T. W. Anderson statistics do not refer to deviations from a fitted regression; hence the Hart [20] and T. W. Anderson [14] significance levels cannot be used here. However, it would appear reasonable to expect that if we have a large positive autocorrelation, $d$ should be near zero; and for a large negative autocorrelation, $d$ should be near four.

Unfortunately an exact distribution could not be evaluated because the regression variables were not latent roots of the numerator matrix. Hence, only upper and lower bounds of the significance levels ($d_U$ and $d_L$) could be computed. This was done for 5%, 2.5%, and 1% one-tailed tests, for $n = 15\ (1)\ 40\ (5)\ 100$ and for $r = 1\ (1)\ 5$. It should be noted that $d_U$ and $d_L$ diverge more as $r$ increases and also as $n$ decreases.

In most cases, the experimenter desires a test of the null hypothesis against the alternative of positive correlation. Hence, one should expect a small value of $d$ when the null hypothesis is false, and we should use the following testing procedure: If the computed value, $d$, is less than the tabulated value, $d^*$, the null hypothesis is rejected. On the other hand, if the alternative hypothesis is negative correlation, one would expect a value of $d$ near 4 when the null hypothesis is false. In this case we consider $d' = 4 - d$ and test $d'$ against $d^*$, as above. Since only upper and lower bounds on the significance levels are available, we proceed as follows:

(i) If $d$ (or $d'$) is less than $d_L$, reject the null hypothesis.
(ii) If $d$ (or $d'$) is greater than $d_U$, do not reject.
(iii) If $d_L < d$ (or $d'$) $< d_U$, the test is inconclusive.

If the experimenter wants a two-tailed test, he doubles the significance probability and proceeds as follows:

(i) If $d$ or $d'$ is less than $d_L$, reject.
(ii) If $d_U < d < 4 - d_U$, do not reject.
(iii) Inconclusive otherwise.
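The bounds procedure described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' own computing routine: the function names are hypothetical, the residuals are invented, and the bounds passed in (1.08 and 1.36) are illustrative numbers only; in practice $d_L$ and $d_U$ must be read from the published tables for the relevant $n$, $r$ and significance level.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squares."""
    numerator = sum(
        (residuals[t + 1] - residuals[t]) ** 2
        for t in range(len(residuals) - 1)
    )
    denominator = sum(value * value for value in residuals)
    return numerator / denominator


def bounds_decision(d, d_lower, d_upper, alternative="positive"):
    """Apply the three-way bounds rule to a computed d.

    For a two-sided test the caller is assumed to supply bounds
    tabulated for the doubled significance probability.
    """
    if alternative == "positive":
        statistic = d
    elif alternative == "negative":
        statistic = 4.0 - d              # d' = 4 - d
    elif alternative == "two-sided":
        statistic = min(d, 4.0 - d)      # the smaller of d and d'
    else:
        raise ValueError("unknown alternative: " + alternative)
    if statistic < d_lower:
        return "reject"
    if statistic > d_upper:
        return "do not reject"
    return "inconclusive"


# Hypothetical residuals with one long run of each sign, as would be
# expected under positive autocorrelation.
d_value = durbin_watson([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
decision = bounds_decision(d_value, d_lower=1.08, d_upper=1.36)
```

Taking the smaller of $d$ and $d'$ in the two-sided case reproduces both parts of the rule: rejection when either is below $d_L$, and non-rejection only when $d_U < d < 4 - d_U$.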
An approximate procedure is available for large values of $(n - r - 1)$, say greater than 40. In this case, $(1/4)d$ was transformed to a Beta distribution, as Dixon [16] did for $r'$, with parameters $p$ and $q$, where

$$p + q = \frac{E(d)\{4 - E(d)\}}{\sigma^2(d)} - 1, \qquad p = \tfrac{1}{4}(p + q)E(d).$$

An approximate test statistic is $F = [p(4 - d)]/qd$ with $n_1 = 2q$, $n_2 = 2p$ degrees of freedom. Or one can use Incomplete Beta tables. Durbin and Watson also present another approximation. Formulas for $E(d)$ and $\sigma^2(d)$ are presented in the 1951 article. Unfortunately, exact significance levels are really needed for small values of $(n - r - 1)$, when $d_L$ and $d_U$ tend to be wide apart.

Durbin and Watson [17, 1951] also present methods of testing for autocorrelation with one- and two-way classification data and for curvilinear regression with equally spaced X's.
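The Beta approximation amounts to matching the first two moments of $d/4$ to those of a Beta variable on the unit interval. The sketch below assumes that reading; the function names and the numerical inputs are hypothetical, and the values of $E(d)$ and $\sigma^2(d)$ would in practice come from the formulas of the 1951 article, which are not reproduced here.

```python
def beta_parameters(mean_d, var_d):
    """Match E(d) and var(d) to d/4 distributed as Beta(p, q).

    If d/4 is Beta(p, q), then E(d) = 4p/(p+q) and
    var(d) = 16pq / ((p+q)^2 (p+q+1)); solving gives the lines below.
    """
    total = mean_d * (4.0 - mean_d) / var_d - 1.0   # p + q
    p = 0.25 * total * mean_d
    q = total - p
    return p, q


def approximate_f(d, p, q):
    """F ratio for d with (2q, 2p) degrees of freedom.

    Small d (positive autocorrelation) gives a large F value.
    """
    f_value = p * (4.0 - d) / (q * d)
    return f_value, 2.0 * q, 2.0 * p


# Hypothetical moments, chosen only to make the arithmetic easy to follow.
p, q = beta_parameters(mean_d=2.0, var_d=0.1)
f_value, n1, n2 = approximate_f(1.0, p, q)
```

The F form follows from the standard relation between Beta and F variables: if $x$ is Beta$(p, q)$, then $q x / \{p(1 - x)\}$ is an F ratio with $(2p, 2q)$ degrees of freedom, and its reciprocal has $(2q, 2p)$.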
An example is presented for each of the three types of regression models. Short-cut methods of computing $S(\Delta Z)^2$ are presented for each. Of course, $SZ^2$ is simply the error sum of squares. For example, with multiple or curvilinear regression, where $Y - \bar{Y}$ is estimated by $\sum_i b_i(X_i - \bar{X}_i)$, $\Delta Z = \Delta Y - \sum_i b_i \Delta X_i$, and

$$S(\Delta Z)^2 = S(\Delta Y)^2 + \sum_i \sum_j b_i b_j S(\Delta X_i \, \Delta X_j) - 2 \sum_i b_i S(\Delta X_i \, \Delta Y).$$

Special formulas can be used for curvilinear regression, because of the orthogonal polynomials used in computing the regression coefficients.
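The short-cut for the sum of squared differences of the residuals can be verified against the direct computation. The following sketch is illustrative; the function names, data and coefficients are hypothetical, and the coefficients need not be least squares values for the identity to hold, since it is only an expansion of a square.

```python
def differences(values):
    """First differences of a series."""
    return [values[t + 1] - values[t] for t in range(len(values) - 1)]


def cross_sum(u, v):
    """Sum of products of two equally long series."""
    return sum(a * b for a, b in zip(u, v))


def direct_sum(y, predictors, coefficients):
    """Difference the residuals explicitly, then square and sum."""
    dy = differences(y)
    dxs = [differences(x) for x in predictors]
    dz = [
        dy[t] - sum(b * dx[t] for b, dx in zip(coefficients, dxs))
        for t in range(len(dy))
    ]
    return cross_sum(dz, dz)


def shortcut_sum(y, predictors, coefficients):
    """Expand the square in terms of sums of differenced Y and X."""
    dy = differences(y)
    dxs = [differences(x) for x in predictors]
    k = len(coefficients)
    total = cross_sum(dy, dy)
    total += sum(
        coefficients[i] * coefficients[j] * cross_sum(dxs[i], dxs[j])
        for i in range(k)
        for j in range(k)
    )
    total -= 2.0 * sum(
        coefficients[i] * cross_sum(dxs[i], dy) for i in range(k)
    )
    return total


# Hypothetical data: a linear and a quadratic predictor.
y = [2.0, 3.5, 3.0, 6.0, 7.5]
predictors = [[1.0, 2.0, 3.0, 4.0, 5.0], [1.0, 4.0, 9.0, 16.0, 25.0]]
coefficients = [0.5, 0.2]
direct = direct_sum(y, predictors, coefficients)
shortcut = shortcut_sum(y, predictors, coefficients)
```

The means drop out on differencing, which is why only the differenced series enter either computation.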
Moran [28] presents an exact test of autocorrelation of the residuals, when only one predictor is used. He uses the circular autocorrelation coefficient,

$$R_1 = \frac{S Z_t Z_{t+1}}{S Z^2}, \qquad Z_{n+1} = Z_1,$$

and gives formulas for $E(R_1)$ and $\sigma^2(R_1)$.


3. ESTIMATING REGRESSION COEFFICIENTS WHEN THE ERRORS
ARE AUTOCORRELATED

Much of the research on autocorrelation has been devoted to the problems of testing for its existence. All too little is known of what to do if the errors actually are autocorrelated. Aitken [1] first showed that if one knew the population covariance matrix of the $\epsilon$'s, he could transform the regression model so that the method of least squares would give efficient estimates of the $\beta$'s. If the covariance matrix of the $\epsilon$'s is $\boldsymbol{\sigma}\sigma^2$ and the regression model (in matrix form) is $Y = X\beta + \epsilon$, we premultiply this regression model by the non-singular matrix $H$, where

$$H\boldsymbol{\sigma}H' = I,$$

and $I$ is the $n \times n$ identity matrix. But even if $\boldsymbol{\sigma}$ were known, the solution for $H$ might be very difficult. However, if the $\epsilon$'s follow a first order autoregressive process with autocorrelation $\rho$ and variance $\sigma^2/(1 - \rho^2)$, the transformation is quite simple:

$$\epsilon_1^* = \sqrt{1 - \rho^2}\,\epsilon_1, \qquad \epsilon_i^* = \epsilon_i - \rho\,\epsilon_{i-1}, \quad \text{for } 1 < i \le n.$$

The transformations for higher order autoregressive processes and for moving average processes are more complicated. A good explanation of this is given by Watson [54].
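For the first order autoregressive case the transformation can be written out directly. The sketch below is a minimal illustration assuming $\rho$ is known; the function name and the numbers are hypothetical. In a regression the same transformation would be applied to $Y$ and to every column of $X$ (including the column of ones for the constant term), after which ordinary least squares is applied to the transformed variables.

```python
import math


def ar1_transform(series, rho):
    """Transform a series so that first order autoregressive errors
    with known autocorrelation rho become uncorrelated.

    The first value is rescaled by sqrt(1 - rho^2); every later value
    has rho times its predecessor subtracted.
    """
    first = math.sqrt(1.0 - rho ** 2) * series[0]
    rest = [series[i] - rho * series[i - 1] for i in range(1, len(series))]
    return [first] + rest


# Hypothetical series and autocorrelation.
transformed = ar1_transform([1.0, 2.0, 3.0], rho=0.5)
```

Rescaling the first observation rather than dropping it keeps all $n$ observations and gives the first transformed error the same variance as the others.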

Allowing for the difficulty of making the transformation if $\boldsymbol{\sigma}$ is known, the major defect is the lack of knowledge regarding the true value of $\boldsymbol{\sigma}$. Most time series are too short to enable one to derive good estimates of the parameters in $\boldsymbol{\sigma}$, or even to determine the type of process which is operating. A recent attempt to bypass the transformation problem, when the $\epsilon$'s follow an autoregressive process, was made by Champernowne [44]. He assumes that the model for the $\epsilon$'s is

$$\sum_{s=0} \gamma_s(\epsilon_{t-s} - \alpha) = \delta_t,$$

where the $\delta$'s are assumed normally and independently distributed with zero mean and variance $\sigma^2$. Champernowne presents the following results:

(i) Assuming the $\gamma$'s are known, $\alpha$ was determined as a weighted mean and $\sigma^2$ as a weighted quadratic function of observed values of the $\epsilon$'s.

(ii) Assuming the $\gamma$'s are known, estimates of and confidence limits for the regression coefficients are derived, both with $\alpha$ known and $\alpha$ unknown.

(iii) The results in (ii) are derived for $\alpha = 0$.

(iv) If the $\gamma$'s are not known, the least-squares estimates of the regression coefficients are not linear functions of the observed Y's and X's; hence, the usual $\chi^2$ distribution theory does not hold exactly. A method involving the application of Bayes' Theorem was used in this case.

(v) A brief discussion is given of these problems when the X's also have disturbances.

Cochrane and Orcutt [46] indicate three principal reasons that the $\epsilon$'s in economic time series models tend to be positively autocorrelated:

(i) Faulty choice of the form of the regression model.
(ii) Omission of important variables from the model.
(iii) Use of incorrect variables or poor data.


They analyzed the sample residuals for a number of econometric studies, and found many significant autocorrelations, using von Neumann's statistic, $\delta^2/s^2$. As indicated earlier, this statistic does not take account of the added correlation of the estimated residuals resulting from the necessity of estimating the regression coefficients; this defect becomes worse as the number of X's increases. In addition $\delta^2/s^2$ does not take account of the autoregressive nature of many X-variables.

Cochrane and Orcutt also conducted some empirical sampling experiments to indicate the effects of autoregressive error processes on least squares regression analysis, with the following indicated results:

(i) The sample residuals tend to be biased towards randomness.

(ii) The variances of least squares estimates of the regression coefficients are very large if the errors are highly autocorrelated (in their example, $\rho = .8$).

(iii) If the autocorrelations could be reduced to $\rho < .3$ or perhaps even $\rho < .5$, by use of a simple transformation, these variances appear to be close to those with random errors.

(iv) The removal of trend seems to be a crude but effective transformation in many cases.

(v) If sample residuals are used to estimate the error variance, $\sigma^2$, this estimate will be too small if the errors are positively correlated. This result can be proven exactly; see, for example, Cochran [45].


Cochrane and Orcutt state that for many economic variables, it is a simple and practical procedure to analyze the first differences of the various series. If the original regression equation is

$$Y_t = \beta_0 + \sum_{i=1}^{r} \beta_i X_{it} + \epsilon_t,$$

the transformed equation will be

$$\Delta Y_t = \sum_{i=1}^{r} \beta_i \Delta X_{it} + \Delta\epsilon_t,$$

where $\Delta Z_t = Z_{t+1} - Z_t$. This would be the exact transformation for $\rho = 1$ in a first order autoregressive model, except that $\rho$ must be less than 1 in order to avoid an explosive situation. However, the transformation will be reasonably good if $\rho$ is near 1, and it is certainly very simple. If the sample residuals, after transforming the variables in this manner, are still highly autocorrelated, one might use the estimated autocorrelation coefficients to try a new transformation.


Stone [52] used the method of first differences advocated by Cochrane 
and Orcutt [46] to reanalyze his market demand data [51]. Stone uses 
the von Neumann statistic with the sample residuals to test for auto- 
correlation in the errors. He found the average autocorrelation for 
18 analyses highly significant before transforming and almost equal 
to its expectation after transforming. It was interesting to note that 
the two sets of regression coefficients were not materially different. 
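The method of first differences is easily sketched for a single predictor. The example below is illustrative only; the function names and data are hypothetical, and a single predictor is assumed so that the least squares coefficient has a closed form. Since the constant term disappears on differencing, the regression of the differenced series is fitted through the origin.

```python
def first_differences(series):
    """Differences between each value and the one before it."""
    return [series[t + 1] - series[t] for t in range(len(series) - 1)]


def slope_through_origin(dx, dy):
    """Least squares coefficient for a regression with no constant term."""
    return sum(a * b for a, b in zip(dx, dy)) / sum(a * a for a in dx)


# Hypothetical series in which Y = 1 + 2X exactly.
x = [1.0, 2.0, 4.0, 7.0]
y = [3.0, 5.0, 9.0, 15.0]
estimate = slope_through_origin(first_differences(x), first_differences(y))
```

With several predictors the same idea applies: each series is differenced and an ordinary multiple regression without a constant term is fitted to the differences.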

Watson [54] has investigated the efficiencies and estimated variances of least-squares estimates of regression coefficients for fixed X's and tests of hypotheses concerning them, when an incorrect transforming model is used. General solutions of the following type are presented: bounds on the bias of the estimated variance, lower bound to the efficiency of the estimates of regression coefficients and some bounds on the significance points of the t- and F-tests. He then discusses the following special types of incorrect transformations:

(i) Assumed and true error processes are both autoregressive.
    (a) Both are first order but an incorrect $\rho_1$ is used. The greatest bias to the estimated variance is a downward bias when $\rho$ is underestimated. This offers some justification for the use of the first difference transformation, which overestimates $\rho$. $\rho$ is generally underestimated from sample residuals. However, we note low efficiencies of estimates of regression coefficients when $\rho$ is overestimated unless $\rho$ is nearly 1.
    (b) True process is second order and assumed process is first order. Results depend on how accurately one knows $\rho_1$ and on the magnitude of $\rho_2$.
(ii) Assumed and true error processes are both moving averages.
    (a) Both are first order with incorrect $\rho_1$ used. Results in (i) are reversed.
    (b) True process is second order and assumed process is first order. Indications are that an incorrect order is more serious for a moving average than for an autoregressive process.
(iii) Assumed process is first order autoregressive and true process is first order moving average. Even when $\rho_1$ is estimated correctly the bias in the variance can be appreciable and the efficiency quite low.

In all cases the true probabilities for 5% significance levels may be considerably different, the bounds being of the order of less than 1% to over 10% in many cases of what would appear to be only mildly inaccurate estimates.

Watson is rather pessimistic regarding the use of transforming devices to remove the effect of autocorrelation in least squares analysis of time series data. However, he believes that more investigations need to be made of correlograms of residuals to see if a good analysis can be constructed on the basis of these correlograms. Quenouille [48] presents a test of the hypothesis that a sample was drawn from an autoregressive scheme of specified order and Wold [55] did the same for a moving average process. Similar tests are given by Bartlett and Diananda [43] and Walker [53]. However, more efficient methods are needed, and especially we need to determine the proper process and order. After all, as Watson [54] remarks, one must use some kind of an analysis, and it is the duty of the statistician to find a good method, even if it is not the correct one.

A series of articles sponsored by the Indian Statistical Institute [47] describe the results of using empirical sampling methods to evaluate the usefulness of the Wold [55] and Quenouille [48] large sample tests for short series. Matthai and Kannan considered three different moving average models and S. R. Rao and Som two autoregressive models; series of length 15 and 35 were used. It was shown that both large sample tests gave far too many significant results for the short series used. Quenouille's test showed that a second order autoregressive model would not fit third order moving average data; however, Wold's test indicated that a third order moving average model could be used even if the data were second order autoregressive. This may indicate that a moving average model of high order is more likely to represent a given set of data than is an autoregressive model. Or it may indicate that Quenouille's test is more powerful than Wold's in indicating the correct process. It was interesting to note that in both studies the correlogram was well estimated, if one knew the correct process. The third paper, by C. R. Rao, presents a sequential procedure for determining the number of sample autocorrelation coefficients needed to estimate the correlogram. Rao advocates the use of likelihood to discriminate between several possible models to represent a given set of data.

Sastry [49] used the above models and data to investigate the small sample bias in the estimates of the autocorrelation coefficients. He first compared definitions (2) and (3) and concluded that (2) was superior. However, (3) is better for small lags and is certainly much easier to compute. In general small sample estimates have large biases, even for series of 100. The size of the bias depends on the type of model (it was much less for a second order autoregressive model than for the other four models) and on the values of the parameters in the model.

Sastry [49] also considers some theoretical results for comparing two series of autocorrelated variates, $x$ and $y$. He presents the expected values of the means, variances, variances of the means, and covariances of $x$ and $y$, and some higher moments for normal variates. He proposes this new statistic to test the hypothesis that $E(x) = \mu$:


e 




y Ae - » === (n= Boat ne = at 


v= + x 


Vee posu аз 
nk 


with ` 


jS =, унд: PUE l 
f= -1- ево DE 


degrees of freedom. Sastry does not indicate how useful $t'$ will be when $\rho_k$ must be estimated from the data. One can surely see that relatively unbiased estimates need to be obtained. And, most important for regression analysis, he presents the expected values and variances of estimates of the parameters in $E(y) = \alpha + \beta x$.

4. FURTHER COMMENTS ON AUTOREGRESSIVE MODELS

Although the main topic of this paper is a discussion of regression analysis with fixed X's, some references on the use of autoregressive models will be included. These models were first discussed by Yule and have been used extensively by economists. The regression coefficients in these models are functions of the autocorrelation coefficients. Hurwicz [58] shows that least squares estimates of the parameters are biased in small samples. As indicated previously, Mann and Wald [6] showed that this bias approached zero as the sample size increased. Only large sample least-squares variances and covariances of the estimates of the parameters are available; hence, confidence limits for the parameters and predicted values are available only for large samples. Tintner [7] presents an example for a third-order process. Kendall [5, 59] gives further information on the use of least squares to estimate the parameters.

Bartlett [2] presents a method of estimation based on the concept of
a continuous rather than a discrete process. Ghurye [57] has developed
a method of using more of the autocorrelation coefficients in estimating
the parameters. He introduces a superposed variation for each opera- 
tion, во that, the model is (assuming %=0): 
| (Ye+n) = 25 BY nei) + € 


A study of autoregressive analysis is presented by Orcutt [60].


Das [56] uses empirical sampling methods to measure the goodness of least squares methods for three equation economic models, in which one equation is quantity in terms of present prices and the others involve only lagged variables as predictors.


5. OMITTED TOPICS


The following topics, of importance in the analysis of time series, have not been discussed in detail.

(i) The estimation of parameters in a multi-equation system. For a discussion of this procedure, see for example Koopmans [62] and Klein [61]. An article by Orcutt and Cochrane [64] presents an empirical sampling study of the adverse effect of autocorrelation on the estimates of structural parameters in a multi-equation model. They concluded that, “Unless it is possible to specify something about the intercorrelations of the error terms in a set of relations and to choose approximately the correct autoregressive transformation, a certain amount of skepticism is justified concerning the possibility of estimating structural parameters from aggregative time series of only twenty observations.”

(ii) Comparing two time series. See Bartlett [15, 2], Orcutt and James [65] and Moran [63].


6. SUMMARY


Much research has been devoted to the distributions of various statistics used to test for the existence of autocorrelation of successive observations. Others have studied the problem of estimating parameters in various stochastic processes, such as autoregressive and moving average processes. A summary of this research is given in this paper.

Only recently has research been extended to the problem of testing for the existence of autocorrelated errors in regression models, such as

$$Y_t = \beta_0 + \sum_{i=1}^{r} \beta_i X_{it} + \epsilon_t, \qquad t = 1, 2, \cdots, n,$$

where the X's are fixed predictors and the $\epsilon$'s are normally distributed with equal variance. Durbin and Watson [17] present upper and lower bounds on the significance levels for making such tests. Moran [28] presents an exact test for $r = 1$.

Too little information is available on the proper methods of estimating the $\beta$'s when the $\epsilon$'s are autocorrelated. Aitken [1] indicated the exact method of transforming the regression variables when the autocorrelations were known. Champernowne [44] added to this general theory and presented a Bayesian method when the autocorrelations were not known.




Cochrane and Orcutt [46] used empirical sampling methods to indicate the effects of autocorrelated errors on the estimates of error and the $\beta$'s. They showed that, in many cases, first differences of the Y's and X's would have a relatively uncorrelated error process. A series of articles in Sankhyā [47] have also used empirical sampling to indicate the large biases in testing and estimation procedures with small samples.

Watson [54] has shown the seriousness of using the wrong type of error process and incorrect estimates of the autocorrelations in transforming the regression variables. He concludes that the most fruitful research seems to be in utilizing more efficiently the estimates of the autocorrelations.


BIBLIOGRAPHY

1. General

[1] Aitken, A. C., “On least squares and linear combinations of observations,” Proceedings of the Royal Society of Edinburgh, 55 (1935) 42-48.
[2] Bartlett, M. S., “On the theoretical specification and sampling properties of autocorrelated time series,” Journal of the Royal Statistical Society Supplement, 8 (1946) 27-41.
[3] Cunningham, L. B. C. and Hynd, W. R. B., “Random processes in problems of air warfare,” Journal of the Royal Statistical Society Supplement, 8 (1946) 62-85.
[4] Foster, G. A. R., “Some instruments for the analysis of time series and their applications to textile research,” Journal of the Royal Statistical Society Supplement, 8 (1946) 42-61.
[5] Kendall, M. G., The advanced theory of statistics, Charles Griffin and Company, Limited, London, 1948.
[6] Mann, H. B. and Wald, A., “On the statistical treatment of linear stochastic difference equations,” Econometrica, 11 (1943) 173-220.
[7] Tintner, G., Econometrics, John Wiley and Sons, Incorporated, New York.
[8] Wold, H., A study in the analysis of stationary time series, Almqvist and Wiksells, Uppsala, 1938.
[9] Yule, G. U., “On the time-correlation problem,” Journal of the Royal Statistical Society, 84 (1921) 496-537.
[10] Yule, G. U., “Why do we sometimes get nonsense correlations between time series?”, Journal of the Royal Statistical Society, 89 (1926) 1-64.

2. Tests of Significance for Autocorrelation

[11] Anderson, R. L., “Serial correlation in the analysis of time series,” Unpublished Thesis, Library, Iowa State College, 1941.
[12] Anderson, R. L., “Distribution of the serial correlation coefficient,” Annals of Mathematical Statistics, 13 (1942) 1-13.
[13] Anderson, R. L. and Anderson, T. W., “Distribution of the circular serial correlation coefficient for residuals from fitted Fourier series,” Annals of Mathematical Statistics, 21 (1950) 59-81.
[14] Anderson, T. W., “On the theory of testing serial correlation,” Skandinavisk Aktuarietidskrift, 31 (1948) 88-116.
[15] Bartlett, M. S., “Some aspects of the time-correlation problem in regard to tests of significance,” Journal of the Royal Statistical Society, 98 (1935) 536-43.
[16] Dixon, W. J., “Further contributions to the problem of serial correlation,” Annals of Mathematical Statistics, 15 (1944) 119-44.
[17] Durbin, J. and Watson, G. S., “Testing for serial correlation in least squares regression,” Biometrika, 37 (1950) 409-28; 38 (1951) 159-78.
[18] Fisher, R. A., “Tests of significance in harmonic analysis,” Proceedings of the Royal Society, Series A, 125 (1929) 54-59.
[19] Hart, B. I., “Tabulation of the probabilities of the ratio of the mean square successive difference to the variance,” Annals of Mathematical Statistics, 13 (1942a) 207-14.
[20] Hart, B. I., “Significance levels for the ratio of the mean square successive difference to the variance,” Annals of Mathematical Statistics, 13 (1942b) 445-47.
[21] Hsu, P. L., “On the asymptotic distribution of certain statistics used in testing the independence between successive observations from a normal population,” Annals of Mathematical Statistics, 17 (1946) 350-54.
[22] Koopmans, T., “Serial correlation and quadratic forms in normal variables,” Annals of Mathematical Statistics, 13 (1942) 14-23.
[23] Lehmann, E. L., “On optimum tests of composite hypotheses with one constraint,” Annals of Mathematical Statistics, 18 (1947) 473-94.
[24] Lehmann, E. L. and Stein, C., “Most powerful tests of composite hypotheses I. Normal distributions,” Annals of Mathematical Statistics, 19 (1948) 495-516.
[25] Leipnik, R. B., “Distribution of the serial correlation coefficient in a circularly correlated universe,” Annals of Mathematical Statistics, 18 (1947) 80-87.
[26] Madow, W. G., “Note on the distribution of the serial correlation coefficient,” Annals of Mathematical Statistics, 16 (1945) 308-10.
[27] Moore, G. H. and Wallis, W. A., “Time series significance tests based on signs of differences,” Journal of the American Statistical Association, 38 (1943) 153-64.
[28] Moran, P. A. P., “A test for the serial independence of residuals,” Biometrika, 37 (1950) 178-81.
[29] Quenouille, M. H., “Some results in the testing of the serial correlation coefficient,” Biometrika, 35 (1948) 261-67.
[30] Quenouille, M. H., “Approximate tests of correlation in time series,” Journal of the Royal Statistical Society Supplement, 11 (1949a) 68-84.
[31] Quenouille, M. H., “The joint distribution of serial correlation coefficients,” Annals of Mathematical Statistics, 20 (1949b) 561-71.
[32] Rubin, H., “On the distribution of the serial correlation coefficient,” Annals of Mathematical Statistics, 16 (1945) 211-15.
[33] von Neumann, J., “Distribution of the ratio of the mean square successive difference to the variance,” Annals of Mathematical Statistics, 12 (1941b) 367-95.
[34] von Neumann, J., “A further remark on the distribution of the ratio of the mean square successive difference to the variance,” Annals of Mathematical Statistics, 13 (1942) 86-88.
[35] von Neumann, J., Kent, R. H., Bellinson, H. R., and Hart, B. I., “The mean square successive difference,” Annals of Mathematical Statistics, 12 (1941a) 153-62.
[36] Wald, A. and Wolfowitz, J., “An exact test for randomness in the non-parametric case based on serial correlation,” Annals of Mathematical Statistics, 14 (1943) 378-88.
[37] Wallis, W. A. and Moore, G. H., “A significance test for time series analysis,” Journal of the American Statistical Association, 36 (1941) 401-09.
[38] Watson, G. S. and Durbin, J., “Exact tests of serial correlation using noncircular statistics,” Annals of Mathematical Statistics, 22 (1951) 446-51.
[39] Whittle, P., Hypothesis testing in time series, Almqvist and Wiksells Boktryckeri AB, Uppsala, (1951).
[40] Williams, J. D., “Moments of the ratio of the mean square successive difference to the mean square difference in samples from a normal universe,” Annals of Mathematical Statistics, 12 (1941) 239-51.
[41] Wilson, E. B., “Periodogram of American business activity,” Quarterly Journal of Economics, 48 (1934) 375-417.
[42] Young, L. C., “On randomness in ordered sequences,” Annals of Mathematical Statistics, 12 (1941) 293-300.


3. Estimating Regression Coefficients When the Errors are Autocorrelated

[43] Bartlett, M. S. and Diananda, P. H., “Extensions of Quenouille's test for autoregressive schemes,” Journal of the Royal Statistical Society, Series B, 12 (1950) 108-15.
[44] Champernowne, D. G., “Sampling theory applied to autoregressive sequences,” Journal of the Royal Statistical Society Supplement, 10 (1948) 204-42.
[45] Cochran, W. G., “Some consequences when the assumptions for the analysis of variance are not satisfied,” Biometrics, 3 (1947) 32-35.
[46] Cochrane, D. and Orcutt, G. H., “A sampling study of the merits of autoregressive and reduced form transformations in regression analysis,” Journal of the American Statistical Association, 44 (1949) 356-72.
[47] Indian Statistical Institute, “The applicability of large sample tests for moving average and autoregressive schemes to series of short length: an experimental study,” Sankhyā, 11 (1951) 217-72.
    1. Matthai, A. and M. B. Kannan, “Moving averages.”
    2. Rao, S. R. and R. K. Som, “Autoregressive series.”
    3. Rao, C. R., “The discriminant function approach in the classification of time series.”
[48] Quenouille, M. H., “A large-sample test for the goodness of fit of autoregressive schemes,” Journal of the Royal Statistical Society, 110 (1948).
[49] Sastry, A. S. R., “Bias in estimation of serial correlation coefficients,” Sankhyā, 11 (1951a) 281-96.
[50] Sastry, A. S. R., “Some moments of moment statistics and their use in tests of significance in autocorrelated series,” Sankhyā, 11 (1951b) 297-308.
[51] Stone, R., “The analysis of market demand,” Journal of the Royal Statistical Society, 108 (1945) 1-98.
[52] Stone, R., “The analysis of market demand,” Review of the International Statistical Institute, 10 (1948) 1-13.
[53] Walker, A. M., “Notes on a generalization of the large sample goodness of fit test for linear autoregression schemes,” Journal of the Royal Statistical Society, Series B, 12 (1950) 102-07.
[54] Watson, G. S., Serial Correlation in Regression Analysis, Unpublished Thesis, Library, North Carolina State College, also Institute of Statistics, Mimeo Series No. 49, (1951).
[55] Wold, H., “A large-sample test for moving averages,” Journal of the Royal Statistical Society, Series B, 11 (1949) 297-305.


4. Autoregressive Models


[56] Das, A. C., “On the estimation of parameters in a recursive system,” San-
khyā, 11 (1951) 273-80.

[57] Ghurye, S. G., “A method of estimating the parameters of an autoregressive
time series,” Biometrika, 37 (1950) 173-78.

[58] Hurwicz, L., “Least squares bias in time series,” Chap. 15 in Koopmans,
T. C., Statistical Inference in Dynamic Economic Models, John Wiley and
Sons, Incorporated, New York, 1950.

[59] Kendall, M. G., “On autoregressive time series,” Biometrika, 33 (1944)
105-22.

[60] Orcutt, G. H., “A study of the autoregressive nature of the time series used
for Tinbergen's model of the economic system of the United States, 1919-
32,” Journal of the Royal Statistical Society Supplement, 10 (1948) 1-58.


5. Other Topics


[61] Klein, L. R., “The use of econometric models as a guide to economic policy,”
Econometrica, 15 (1947) 111-51.

[62] Koopmans, T. C., Statistical Inference in Dynamic Economic Models,
John Wiley and Sons, Incorporated, New York, (1950).

[63] Moran, P. A. P., “Some theorems on time series,” Biometrika, 34 (1947)
281-91; 35 (1948) 255-60.

[64] Orcutt, G. H. and Cochrane, D., “A sampling study of the merits of auto-
regressive and reduced form transformations in regression analysis,” Journal
of the American Statistical Association, 44 (1949) 356-72.

[65] Orcutt, G. H. and James, S. F., “Testing the significance of correlation be-
tween time series,” Biometrika, 35 (1949) 397-413.



JOINT CONFIDENCE REGIONS FOR MULTIPLE
REGRESSION COEFFICIENTS


DAVID DURAND
National Bureau of Economic Research


More and more statisticians are coming to realize that conventional
confidence intervals are not strictly applicable to problems requiring
the estimation of several parameters. In multiple regression a
conventional interval may be correctly determined for one, and usually
only one, of the regression coefficients. Ordinarily, however, the
statistician wants a measure of accuracy for each of his coefficients,
but if he obtains these in the form of conventional confidence intervals,
he usually commits a fallacy. Here we discuss the nature of this fallacy
and a possible remedy through the use of a joint confidence region.


1. MULTIPLE CONFIDENCE STATEMENTS

In classical multiple regression it is assumed that the dependent
variate Y is normally distributed with constant variance about a linear
function

(1.1)    $E(Y) = b_0 + b_1 X_1 + b_2 X_2 + \cdots + b_k X_k,$

in which the coefficients $b_i$ as well as the variance $\sigma^2$ are
unknown parameters. From a set of $n > k+1$ error free and linearly
independent observations on the $X_i$'s and $n$ corresponding values for
Y, one obtains the estimates $\hat{b}_i$ and $\hat{\sigma}^2$, here
understood to be maximum likelihood estimates. Then, the theory of
confidence provides criteria for judging the accuracy of these estimates.

After deriving the k-variable regression function (1.1) many statisti-
cians would want exactly $k+2$ confidence statements: one for each of
the $k+1$ $b_i$'s and one for $\sigma^2$. But this rule is far from
general. Personal preference or the requirements of the problem may
dictate 1, 2, ..., $k+2$, or even more statements. Since the theory of
confidence permits statements for linear combinations of the type

$c_0 b_0 + c_1 b_1 + c_2 b_2 + \cdots + c_k b_k,$

there is literally no limit to the number of confidence statements that
can be considered.

The conventional form of confidence interval for a partial regression
coefficient is determined by the relation

(1.2)    $\hat{b}_p - t_\alpha \sqrt{a^{pp} n \hat{\sigma}^2/(n-k-1)} \le b_p \le \hat{b}_p + t_\alpha \sqrt{a^{pp} n \hat{\sigma}^2/(n-k-1)},$

where $t_\alpha$ is the upper $\tfrac{1}{2}\alpha$ point of ‘Student's’
ratio for $n-k-1$ degrees of freedom, and $a^{pp}$ is the element
corresponding to $a_{pp}$ in the inverse of the matrix
$\|a_{ij}\| = \|\sum_\alpha (X_{i\alpha} - \bar{X}_i)(X_{j\alpha} - \bar{X}_j)\|$.
Given a value for $\alpha$, say .05, (1.2) determines an interval that
will cover the true parameter value $b_p$ with probability $1-\alpha$.¹
But although these conventional intervals are entirely valid when
properly applied and interpreted, there are two basic fallacies or
improprieties that frequently arise in practical work. The first
consists in deciding after the experiment what confidence statements to
make. The second consists in making several individual statements at
level $1-\alpha$ in a way that implies a joint statement at the same
level.

Concerning the first fallacy, it is common practice in statistical
studies in general to go over the data with a fine-toothed comb, to apply
a battery of significance tests, and then to select a relatively few con-
clusions that seem particularly noteworthy. In regression studies it is
common to experiment with several equations before selecting one for
presentation. This might consist in calculating a regression equation
with five variables, discovering that two of the coefficients do not differ
significantly from zero, and then recomputing the equation with three
variables. But procedures such as these may introduce bias, as may be
seen from an extreme example. Suppose that the true regression coeffi-
cients were all zero throughout a series of experiments, and suppose
that the experimenter made a practice of presenting only regression
equations with coefficients significantly different from zero at the level
$\alpha$. Then, if he made confidence statements for these coefficients,
his probability of being right would not be $1-\alpha$ at all, but zero;
for he would make a statement only when the parameter value zero lay
outside the confidence interval.
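The extreme example above can be imitated numerically. The following is only a minimal sketch under simplifying assumptions that are not in the text: a single coefficient whose true value is zero, an estimate with known unit standard error, and a normal critical value of 1.96 in place of Student's ratio. The function name and its parameters are illustrative.

```python
import random

def selective_coverage(trials=20000, z=1.96, seed=0):
    """Count reported intervals and how many of them cover the true value 0."""
    rng = random.Random(seed)
    reported = 0
    covered = 0
    for _ in range(trials):
        estimate = rng.gauss(0.0, 1.0)          # true coefficient 0, standard error 1
        lower, upper = estimate - z, estimate + z
        if lower > 0.0 or upper < 0.0:           # "significant": interval excludes zero
            reported += 1
            if lower <= 0.0 <= upper:            # can never hold for a reported interval
                covered += 1
    return reported, covered

reported, covered = selective_coverage()
print(reported, covered)
```

Roughly five per cent of the trials are reported, and none of the reported intervals covers the true value, which is the zero probability of being right described in the paragraph.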

The second common fallacy in using conventional confidence inter-
vals is the implied joint statement. In a two-variable problem, such
as that presented in Section 3, we might make the following three con-
ventional statements at the 95 per cent level:


1 Perhaps it is necessary, for the record, to say a word about the meaning of “probability” in this
context. Once the experiment has been performed, the interval either does or does not cover the point $b_p$,
and the probability is therefore 1 or 0. Before the experiment, however, we may argue that the outcome
is uncertain and that the probability is properly $1-\alpha$. And before performing a series of experiments,
we may argue that $1-\alpha$ is the proportion of projected confidence statements that will be correct in the
long run.


132 AMERICAN STATISTICAL ASSOCIATION JOURNAL, MARCH 1954 


         $-.04 \le b_1 \le .44$

(1.3)    $.52 \le b_2 \le .98$

         $.91 \le b_1 + b_2 \le .99.$
The first statement locates the parameter point ($b_1$, $b_2$) within an
infinite band bounded by two lines parallel to the $b_2$ axis in
two-dimensional parameter space (Figure 2); the second locates the same
point in a band between lines parallel to the $b_1$ axis; and the third
locates the point between two parallel sloping lines. But the three
statements, taken all together, locate the point within the hexagon
formed by the intersection of the three bands. So, if the confidence
level of the individual statements is $1-\alpha = .95$, the level of the
joint statement is demonstrably less.

The actual confidence level of a set of statements like (1.3) can be
readily obtained for one or two special cases, the most obvious of which
occurs when, in a k-variable problem, the cross-products
$a_{ij}\ (i \ne j) = \sum_\alpha (X_{i\alpha} - \bar{X}_i)(X_{j\alpha} - \bar{X}_j)$
are all zero and the variance $\sigma^2$ is known.
Then, the sampling distributions of the individual $\hat{b}_i$'s are all
independent, and the joint confidence level for any subset containing
exactly m of these coefficients is therefore $(1-\alpha)^m$. Another
special case, involving differences between means in the analysis of
variance, has been discussed by Tukey [11]. But in general the joint
confidence level of statements like (1.3) is not readily obtained, though
a lower bound can be obtained as shown in Section 5.
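For the independent special case just described, the joint level $(1-\alpha)^m$ is easy to tabulate. The short sketch below does so; the function name is illustrative, and the choice of m values is arbitrary.

```python
def joint_level_independent(alpha, m):
    """Joint confidence level of m independent statements, each at level 1 - alpha."""
    return (1.0 - alpha) ** m

for m in (1, 2, 3, 5, 10):
    print(m, round(joint_level_independent(0.05, m), 4))
```

With ten independent statements at the conventional .95 level the joint level is already below .60, which gives a sense of how quickly the implied joint statement weakens.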

As an alternative to calculating the confidence level of joint state-
ments like (1.3), Scheffé [9], Roy and Bose [8], and the author propose
to define an infinite set of intervals whose totality is equivalent to a
joint confidence ellipse at the level $1-\alpha$. Although confidence
ellipses have been understood in theory for some time, they have received
little practical application, notwithstanding an important econometric
example by Haavelmo [4]. Possibly, the unpopularity of the ellipse is
due partly to the difficulty of representing it graphically (except in
two dimensional examples, like Haavelmo's), but this is largely obviated
by the proposed technique of substituting an infinite set of intervals.

The use of the ellipse with an infinite set of intervals has two distinct
advantages. First, the calculations required are no more difficult than
those required in the conventional approach. Second, if a finite or
infinite subset of the intervals is chosen in any way, either before
or after the experiment, the confidence level cannot be less than
$1-\alpha$; thus the fallacy of choosing statements after the experiment
is avoided. At the same time, this approach has one drawback: whenever
interest is limited to a finite subset of intervals that can be specified
in advance, the confidence level will actually exceed $1-\alpha$, and the
intervals will therefore be larger than necessary. For example, Tukey's
method for contrasting means in the analysis of variance allows for just
$\tfrac{1}{2}k(k-1)$ specified contrasts among k means; and for $k > 2$,
as shown by Scheffé [9], Tukey's intervals are smaller than those derived
from the joint ellipse, and the difference increases as k increases.
However, it should be remembered that the joint ellipse is proposed
primarily for investigations where the specification of questions in
advance is not convenient. The next section presents an example
characterized by a proliferation of possible questions and great need for
flexibility, and it is for problems of this sort that the joint ellipse,
with its infinite set of intervals, is ideally suited.


2. AN EXAMPLE

In an exploratory cross-section study of seventeen New York bank
stocks for February 1951, the dependent variate, log P, had an esti-
mated variance $\hat{\sigma}^2 = .0006863$ about the estimated regression
plane

(2.1)    $\log P = .037 + .65 \log C - .95 \log S + .26 \log D,$

which then leads to the exponential form

(2.2)    $P = 1.09\, C^{.65} S^{-.95} D^{.26}.$

In these equations, P is market price (end of February 1951), C is total
capital funds (end of 1950 in units of $10,000), S is total number of
shares outstanding (end of February 1951 in units of 10,000), and D
is total dividend disbursements (during 1950 in units of $100). The
means and the $a_{ij}$ matrix for the logarithmic variables are given in
Table 1. The forward Doolittle solution, which will be needed in
subsequent discussion, is given in Table 2.


TABLE 1

MEANS AND SUMS OF SQUARES AND PRODUCTS ABOUT THE
MEANS FOR REGRESSION ANALYSIS OF 17 NEW
YORK CITY BANK STOCKS, FEBRUARY 1951

Note: The variables in this example were all expressed as common logarithms with
4-place mantissas. The sums of squares and products were rounded to 5 decimals
for use in calculations.

                        X_1             X_2             X_3              Y
                    Log Capital     Log Shares     Log Dividends     Log Price
                    (year end       (February      (1950, unit:      (February
                    1950, unit:     1951, unit:    $100)             1951, unit:
                    $10,000)        10,000)                          $1.00)

Means                 3.9288          1.9353          4.5206           1.9512

Sums of Squares and
Products about the
Means
   X_1                3.25538         3.51281         3.42601          -.30838
   X_2                                7.25042         3.55875         -3.05901
   X_3                                                3.09849          -.16441
   Y                                                                   3.23758


TABLE 2 
FORWARD DOOLITTLE SOLUTION FOR REGRESSION 
COEFFICIENTS IN EQUATION (2.1) 


Note: Although entries have been rounded to 4 decimals for illustration here, 
the original matrix (see Note, Table 1) was obtained to 5 decimals, and subse- 
quent calculations were carried to 8 or 9 decimals. 


                X_1           X_2           X_3            Y
                Log           Log           Log            Log          Check
                Capital       Shares        Dividends      Price        Sum
                ($10,000)     (10,000)      ($100)         ($1.00)
8.8554 3.1098 > 8.4260 .8084 10.5026 
01.0000 * -1.07917 —1.0524 — .0947 —3.2262 
- . 8.5128 7.2504. - ^ 83.6528 , 3.6596 17.9856 
—3.5128 —3 7906 ^ —8.6969 -3328 — , —11.3831 
2.46088 | — .1482 E 3268 — 6.6525 
5 —1.0000 .0418 — .9591 —1.9178 
2 2 i 
3.4260 3.5538 "3.0085 * 16и, 10.8427 | 
-8.4200 | —3.6069 —3.6056 — .8245 —11.0581 
.1432 = .0059* 1378 ‚27146 
.0870 — .0228 .0642 
—1.0000 ‚ .2622 — .T878 


Though the form of equations (2.1) and (2.2) is convenient for com-
putation and meets the needs of an estimating equation, it does not
adequately describe the structure of the bank stock market. For this,
one wants to relate stock prices to such variables as book value and
dividends per share. However, a simple linear transformation on the
logarithms of the independent variables in (2.1) and (2.2),

         $\log C = \log C$

         $\log C/S = \log C - \log S$

         $\log D/S = \log D - \log S,$

produces a new equation

(2.3)    $P = 1.09\, C^{-.04} (C/S)^{.69} (D/S)^{.26},$

where the transformed independent variables are total capital (which
indicates size of bank), capital per share (book value), and dividends
per share. In this transformed equation the new coefficients (or ex-
ponents) are all linear combinations of the original coefficients, thus:

         $-.04 = .65 + .26 - .95$
         $.69 = .95 - .26.$

The variance, $\hat{\sigma}^2 = .0006863$, is unaffected.
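The arithmetic of this transformation can be checked in a few lines. The sketch below is only an illustration: it starts from the rounded exponents of (2.2), and the function name and dictionary keys are assumptions of the sketch, not the author's notation.

```python
def transformed_exponents(b_c, b_s, b_d):
    """Exponents on C, C/S and D/S implied by exponents on C, S and D.

    Substituting log S = log C - log(C/S) and log D = log(D/S) + log S
    into b_c log C + b_s log S + b_d log D and collecting terms.
    """
    return {
        "C": b_c + b_s + b_d,      # total capital (size of bank)
        "C/S": -(b_s + b_d),       # capital per share (book value)
        "D/S": b_d,                # dividends per share
    }

new = transformed_exponents(0.65, -0.95, 0.26)
print({name: round(value, 2) for name, value in new.items()})
```

The three results agree with the exponents shown in (2.3), and each is a linear combination of the original coefficients, as stated.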

At the time this regression analysis was performed, most bank stocks
were selling at substantial discounts from book value, a fact that
worried many financiers. Again, suitable equations for studying dis-
counts can be derived from (2.2) by other linear transformations on
the logarithmic variables. One possible equation is

(2.4)    $P/B = 1.09\, B^{-.09} (D/C)^{.26} S^{-.04},$

where $B = C/S$ represents book value. Here, as in (2.3), the new coeffi-
cients are all linear combinations of the coefficients of (2.1), and the
variance remains unchanged.

Thus, by the nature of this problem, it is possible to start with a basic
equation, (2.1) or (2.2), and to derive from this by suitable trans-
formations a series of special purpose equations. This process, however,
multiplies the number of regression coefficients and combinations for
which confidence statements are required. In addition to the constant
term, which is not affected by the illustrated transformations, equa-
tions (2.2), (2.3), and (2.4) contain five different regression coefficients,
and if all the ramifications of the problem were to be explored, more
equations and coefficients would undoubtedly arise. Moreover, it
should be realized that this example was artificially cut down for
simplicity in presentation. A systematic study of bank stock prices
should contain anywhere from five to ten basic variables and upwards
of twenty-five transformed variables.

To avoid the fallacies of multiple statements in this problem, where
so many statements are possible, a joint confidence ellipse will be
determined for $b_1$, $b_2$, and $b_3$. Although the constant term $b_0$
could be included in the ellipse (see Section 6), this particular problem
is primarily concerned with the structure of the market as indicated by
$b_1$, $b_2$, and $b_3$, not with the general level of the market as
reflected by $b_0$. However, a statement may be desired for the variance
$\sigma^2$, and this can be obtained in the conventional manner provided
no joint relationship is thus implied for $\sigma^2$ and the $b_i$'s.

3. THE JOINT CONFIDENCE ELLIPSOID 


A joint confidence region for the k regression coefficients, obtained
when a single dependent variate is regressed upon k independent vari-
ables $X_1, X_2, \cdots, X_k$, is given by the ellipsoid

(3.1)    $F_\alpha(k, n-k-1) = \dfrac{(n-k-1)\sum_{i=1}^{k}\sum_{j=1}^{k} a_{ij}(\hat{b}_i - b_i)(\hat{b}_j - b_j)}{k n \hat{\sigma}^2},$

where $F_\alpha(k, n-k-1)$ is the upper $\alpha$ point of the
F-distribution for k and $n-k-1$ degrees of freedom, n is the number of
observations,
$a_{ij} = \sum_\alpha (X_{i\alpha} - \bar{X}_i)(X_{j\alpha} - \bar{X}_j)$,
$\hat{b}_i$ is the maximum likelihood estimate of the true regression
coefficient $b_i$, and $\hat{\sigma}^2$ is the maximum likelihood
estimate of the variance.² To apply (3.1) a value of $\alpha$ is chosen,
say .05, and the single statement is made with probability
$1-\alpha = .95$ that the parameter point $(b_1, b_2, \cdots, b_k)$ lies
within the ellipsoid.
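Deciding whether a trial parameter point is admissible amounts to evaluating the right-hand member of (3.1) and comparing it with the tabled F value. The sketch below shows that calculation; the function names are illustrative, and the matrix and estimates in the usage lines are hypothetical numbers chosen only to exercise the code, not values from the bank stock study.

```python
def ellipsoid_statistic(a, b_hat, b, n, sigma2_hat):
    """Right-hand member of (3.1) for a trial parameter point b.

    a          : k x k matrix of sums of squares and products about the means
    b_hat      : maximum likelihood estimates of the k coefficients
    b          : trial values of the k true coefficients
    sigma2_hat : maximum likelihood estimate of the variance (divisor n)
    """
    k = len(b_hat)
    d = [bh - bt for bh, bt in zip(b_hat, b)]
    quad = sum(a[i][j] * d[i] * d[j] for i in range(k) for j in range(k))
    return (n - k - 1) * quad / (k * n * sigma2_hat)

def is_admissible(a, b_hat, b, n, sigma2_hat, f_crit):
    """True when the trial point lies inside or on the joint confidence ellipsoid."""
    return ellipsoid_statistic(a, b_hat, b, n, sigma2_hat) <= f_crit

# Hypothetical two-variable illustration with n = 17 and F_.05(2, 14) = 3.739.
a = [[2.0, 1.0], [1.0, 3.0]]
print(is_admissible(a, [0.5, 0.5], [0.45, 0.55], 17, 0.001, 3.739))
```

Because the quadratic form uses the full $a_{ij}$ matrix, the test accounts for the correlation between the estimates, which separate conventional intervals do not.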

To illustrate a joint confidence region graphically, the prices of the
seventeen New York City bank stocks were regressed on the two vari-
ables dividends per share, D/S, and book value, C/S; and the following
resulted, with variables now expressed in dollars:

(3.2)    $P = 2.15\, (D/S)^{.20} (C/S)^{.75}.$

A 95 per cent ellipse was then determined by inserting in (3.1) the
appropriate numerical values, including $F_{.05}(2, 14) = 3.739$ and
$\hat{\sigma}^2 = .0008806$. A graph is shown in Figure 1. Strictly
speaking, this confidence region applies only to points in
two-dimensional parameter space, that is, to paired values of $b_1$ and
$b_2$. Thus the combination $b_1 = .30$ and $b_2 = .65$ lies within the
ellipse and is admissible, whereas the combination $b_1 = .15$ and
$b_2 = .70$ lies outside and is inadmissible. It will be noticed
immediately, however, that for certain values of $b_1$,

2 For derivation of (3.1) see Wilks [12, Sec. 8.3]. In following the derivation, one must remember


that Wilke makes one of the Xis, say X:, arbitrari ity; bim CEA 
ae bs herein, and Pin his notation is the same aa E There — o he notation ia овон 




namely, those less than −.09 or greater than .49, all points lie outside
the ellipse. Likewise, for $b_2$ less than .45 or greater than 1.05, all
points lie outside. Thus, we obtain the intervals

(3.3)    $-.09 \le b_1 \le .49$
         $.45 \le b_2 \le 1.05,$

each of which corresponds to a confidence level exceeding $1-\alpha$.
Although intervals of this type have been called confidence intervals,
they are here referred to as “subsidiary intervals” (subsidiary to the
ellipse) in order to distinguish them from the conventional intervals.

Fig. 1. 95 per cent joint confidence ellipse and subsidiary
intervals for (3.2).

Similar subsidiary intervals can be derived, all from the same ellipse,
for linear combinations of the regression coefficients. In regressions
employing the exponential $B_0 U_1^{b_1} U_2^{b_2} \cdots U_k^{b_k}$ the
degree of this homogeneous function is often of interest, and this is
indicated by the sum $b_1 + b_2 + \cdots + b_k$. In production studies
this sum indicates the degree of returns to scale;³ in the current
example, (3.2), it indicates whether a stock split will result in a
proportional reduction in the price of the stock. The subsidiary interval
for the sum $b_1 + b_2$ in (3.2),

(3.4)    $.90 \le b_1 + b_2 \le 1.00,$

can be established graphically by drawing lines of the family
$y = b_1 + b_2$ in Figure 1 and regarding as inadmissible all those that
do not meet the ellipse at any point.

3 In production studies it is usually desirable to assume that the independent variables are subject to
error. The treatment of this problem and a substantial bibliography are given by Tintner [10].
The totality of all possible subsidiary intervals is, in a sense, equiva-
lent to the joint ellipse. The combined intervals (3.3) restrict the
parameter point ($b_1$, $b_2$) to the rectangle BGEH of Figure 1, which
includes all of the ellipse. The combination of (3.3) and (3.4) further
restricts this point to the hexagon ABCDEF, which again includes the
ellipse. If this process is repeated by combining still more intervals,
say for $b_2 - b_1$ or $b_1 - 3b_2$, the resulting many-sided figure can
be made to approach the ellipse as closely as desired. Hence the totality
of all possible subsidiary intervals has a joint confidence level of
$1-\alpha$. Accordingly, if one performs a series of experiments, he may
expect that in at least $100(1-\alpha)$ per cent of them, all of the
subsidiary intervals, however chosen, will cover the true parameter point.


4. SUBSIDIARY INTERVALS IN k DIMENSIONS


In a two dimensional problem like the preceding, subsidiary limits
can be derived graphically with fair accuracy. But when greater accu-
racy is desired, or when more than two dimensions are involved, an
analytic procedure is indicated. The problem in general is to set limits
for linear combinations

(4.1)    $Q = \sum_{i=1}^{k} h_i b_i,$

where the constants $h_i$ are given. If arbitrary values are assigned to
Q, equation (4.1) defines a family of hyperplanes in k-dimensional param-
eter space. In particular there are two distinct quantities
$\underline{Q}$ and $\bar{Q}$ ($\underline{Q} < \bar{Q}$) that define two
planes tangent to the confidence ellipse (3.1). Hence all members of
family (4.1) having the property $Q > \bar{Q}$ or $Q < \underline{Q}$
will lie completely outside the ellipse, and the subsidiary statement is,
therefore, $\underline{Q} \le Q \le \bar{Q}$.

Finding the tangent planes and the quantities $\underline{Q}$ and
$\bar{Q}$ may be facilitated by transforming the confidence ellipse (3.1)
into a sphere with its center at the origin. For this purpose a linear
transformation is required⁴

(4.2)    $d_i = \hat{b}_i - b_i = \sum_{j=1}^{k} c_{ij}\delta_j$

such that

(4.3)    $\sum_{i=1}^{k}\sum_{j=1}^{k} a_{ij}(\hat{b}_i - b_i)(\hat{b}_j - b_j) = \sum_{i=1}^{k}\sum_{j=1}^{k} a_{ij} d_i d_j = \sum_{i=1}^{k} \delta_i^2.$

By means of (4.2) the confidence ellipse is transformed into the sphere

(4.4)    $\sum_{i=1}^{k} \delta_i^2 = \dfrac{F_\alpha(k, n-k-1)\, k n \hat{\sigma}^2}{n-k-1},$

and the tangent planes $\bar{Q} = \sum h_i b_i$ and
$\underline{Q} = \sum h_i b_i$ are transformed into new planes tangent to
this sphere. For $\underline{Q}$ (a similar relation holds for $\bar{Q}$)
the new plane is

         $\sum_{i=1}^{k} h_i \hat{b}_i - \underline{Q} = \sum_{j=1}^{k} m_j \delta_j,$

where

         $m_j = \sum_{i=1}^{k} h_i c_{ij}.$

A solution for $\underline{Q}$ is now obtainable from

(4.5)    $\dfrac{\sum h_i \hat{b}_i - \underline{Q}}{\sqrt{\sum m_j^2}} = \sqrt{\dfrac{F_\alpha(k, n-k-1)\, k n \hat{\sigma}^2}{n-k-1}},$

where the left hand member is the distance between the transformed
plane and the origin, and the right hand member is the radius of the
transformed confidence sphere (4.4). A solution for $\bar{Q}$ can be
obtained by taking a negative value for one of the square roots in (4.5).
Therefore, the subsidiary interval takes the form
(4.6)    $\sum h_i \hat{b}_i - \sqrt{\dfrac{F_\alpha(k, n-k-1)\, k n \hat{\sigma}^2 \sum m_j^2}{n-k-1}} \le \sum h_i b_i \le \sum h_i \hat{b}_i + \sqrt{\dfrac{F_\alpha(k, n-k-1)\, k n \hat{\sigma}^2 \sum m_j^2}{n-k-1}}.$

In the above, it can be shown (though the proof will not be given)
that the quantity

         $\dfrac{n \hat{\sigma}^2 \sum m_j^2}{n-k-1}$

is the unbiased estimate of the variance of the linear combination
$\hat{Q} = \sum h_i \hat{b}_i$. Then (4.6) can be written

(4.7)    $\hat{Q} - \sqrt{k F_\alpha(k, n-k-1)\ \text{Est. Var. } \hat{Q}} \le Q \le \hat{Q} + \sqrt{k F_\alpha(k, n-k-1)\ \text{Est. Var. } \hat{Q}},$

which is essentially the result derived by Scheffé [9] for contrasts in
the analysis of variance and by Roy and Bose [8] for the general linear
case.

4 Such a transformation can be found (in fact a number of different transformations can be found)
if the matrix of the $a_{ij}$'s is of rank k (Bôcher [1, p. 134 ff.]). Moreover, the rank will be k if the $X_i$'s are
linearly independent as assumed (Wilks [12, p. 160]).

Of the many possible ways of deriving transformation (4.2), the one
chosen for illustration here is conveniently used in conjunction with the
Doolittle solution for the normal regression equations. In essence, the
quadratic form on the left of (4.3) is reduced to a sum of squares by the
method of Lagrange (see Bôcher [1, p. 131]). The reduction is performed
by the following transformation:

         $\delta_1 = (a_{11} d_1 + a_{12} d_2 + a_{13} d_3 + \cdots + a_{1k} d_k)/\sqrt{a_{11}}$
(4.8)    $\delta_2 = (a'_{22} d_2 + a'_{23} d_3 + \cdots + a'_{2k} d_k)/\sqrt{a'_{22}}$
         $\delta_3 = (a''_{33} d_3 + \cdots + a''_{3k} d_k)/\sqrt{a''_{33}}$
         $\cdots$

in which

         $a'_{ij} = a_{ij} - a_{1i} a_{1j}/a_{11}$

         $a''_{ij} = a'_{ij} - a'_{2i} a'_{2j}/a'_{22}$

         $\cdots$

Thus, all of the entries in (4.8) arise in the regular reduction process of
the forward Doolittle solution. Finally, the required transformation
(4.2) is obtained by inverting (4.8), which is easy.
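The reduction in (4.8) can be written as a short routine. What follows is a minimal sketch, assuming a symmetric positive definite matrix supplied as a list of rows; the function name is illustrative, and the 2 by 2 matrix in the usage lines is a made-up example rather than data from the paper. Each pass divides the current pivot row by the square root of its pivot and then forms the primed quantities for the remaining rows.

```python
import math

def lagrange_reduction(a):
    """Return rows u such that delta_i = sum_j u[i][j] * d_j, as in (4.8).

    The rows are upper triangular, and sum_i delta_i**2 equals the
    quadratic form sum_ij a[i][j] * d_i * d_j.
    """
    k = len(a)
    work = [list(row) for row in a]           # copy; holds a, a', a'', ...
    u = [[0.0] * k for _ in range(k)]
    for p in range(k):
        pivot = work[p][p]
        if pivot <= 0.0:
            raise ValueError("matrix is not positive definite")
        root = math.sqrt(pivot)
        for j in range(p, k):
            u[p][j] = work[p][j] / root       # row p of (4.8)
        for i in range(p + 1, k):
            for j in range(p + 1, k):
                # a'_{ij} = a_{ij} - a_{pi} a_{pj} / a_{pp}
                work[i][j] -= work[p][i] * work[p][j] / pivot
    return u

u = lagrange_reduction([[4.0, 2.0], [2.0, 3.0]])
print(u)
```

The result is the same triangular factor that a Cholesky decomposition would give, which is one way to see why the entries fall out of the forward Doolittle solution.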



As a numerical example, Table 2 shows the standard forward Doolit-
tle solution for the regression coefficients of (2.1) and (2.2). The stand-
ard form is commonly found in texts (see, for example, Croxton and
Cowden [2, pp. 716-20]) and is therefore suitable for illustration. For
practical computation, the abbreviated Doolittle is probably superior
(see Dwyer [3, pp. 107-12]). Transformation (4.8) would be derived
from the italicized quantities in Table 2 as follows:

         $\delta_1 = (3.2554 d_1 + 3.5128 d_2 + 3.4260 d_3)/1.8043$
         $\delta_2 = (3.4688 d_2 - .1432 d_3)/1.8625$
         $\delta_3 = .0870 d_3/.2950.$

This is easily inverted by solving for the $d_i$'s in terms of the
$\delta_i$'s, thus:

         $d_1 = .5542\delta_1 - .5794\delta_2 - 3.7189\delta_3$
(4.2')   $d_2 = .5369\delta_2 + .1400\delta_3$
         $d_3 = 3.3903\delta_3.$

Now suppose that a subsidiary statement is desired for the sum
$b_1 + b_2 + b_3$ in (2.1), whose maximum likelihood estimate is −.04. The
same subsidiary statement is applicable, of course, to the exponent of
C in (2.3) and of S in (2.4). By means of (4.2') the sum is transformed
thus:

         $\hat{b}_1 + \hat{b}_2 + \hat{b}_3 - Q = d_1 + d_2 + d_3 = \sum m_j \delta_j = .5542\delta_1 - .0425\delta_2 - .1886\delta_3.$

From the above, one calculates the quantity

         $\sum m_j^2 = .3445.$

This is substituted in (4.6) along with $k = 3$, $n = 17$,
$\hat{\sigma}^2 = .0006863$, and $F_{.05}(3, 13) = 3.410$, and the
resulting interval is $-.096 \le b_1 + b_2 + b_3 \le .016$.
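The numerical steps of this example can be retraced as follows. This is a sketch only: it starts from the rounded four-decimal quantities quoted in the text rather than the 8 or 9 decimals carried in the original work, it takes the reduced entry $a'_{23}$ as −.1432 (an assumption, chosen because that value reproduces the inverted transformation), and the function and variable names are illustrative.

```python
import math

# Reduced quantities quoted in the text for transformation (4.8).
a11, a12, a13 = 3.2554, 3.5128, 3.4260
a22_1, a23_1 = 3.4688, -0.1432      # a'_22 and a'_23 (the latter is an assumption)
a33_2 = 0.0870                      # a''_33

# Rows of the upper-triangular map delta = U d.
U = [
    [a11 / math.sqrt(a11), a12 / math.sqrt(a11), a13 / math.sqrt(a11)],
    [0.0, a22_1 / math.sqrt(a22_1), a23_1 / math.sqrt(a22_1)],
    [0.0, 0.0, a33_2 / math.sqrt(a33_2)],
]

def plane_coefficients(U, h):
    """Solve U^T m = h by forward substitution, so sum(h_i d_i) = sum(m_j delta_j)."""
    k = len(h)
    m = [0.0] * k
    for j in range(k):
        m[j] = (h[j] - sum(U[i][j] * m[i] for i in range(j))) / U[j][j]
    return m

k, n, sigma2_hat, f_crit = 3, 17, 0.0006863, 3.410
q_hat = -0.04                                   # estimate of b1 + b2 + b3
m = plane_coefficients(U, [1.0, 1.0, 1.0])
sum_m2 = sum(v * v for v in m)
half_width = math.sqrt(f_crit * k * n * sigma2_hat * sum_m2 / (n - k - 1))
lower, upper = q_hat - half_width, q_hat + half_width
print([round(v, 4) for v in m], round(sum_m2, 4), round(lower, 3), round(upper, 3))
```

Solving the triangular system directly avoids writing out the inverse transformation, but it yields the same $m_j$ as summing the columns of (4.2').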


5. THE JOINT CONFIDENCE ELLIPSE VERSUS CONVENTIONAL 
CONFIDENCE INTERVALS 


Since Scheffé [9] has discussed at some length the relation between
the joint ellipse and the conventional form of confidence interval
in the analysis of variance, only a brief extension to regression prob-
lems is needed here. In Figure 2, which again refers to (3.2), the con-
ventional 95 per cent limits, already given in (1.3), are shown super-
imposed on the ellipse of Figure 1. The hexagon A'B'C'D'E'F' formed
by the intersection of the three conventional bands is similar and sim-
ilarly situated to the circumscribed hexagon ABCDEF in Figure 1,
and one may surmise that the smaller hexagon encloses an ellipse sim-
ilar to the 95 per cent ellipse. Geometrically, then, the conventional
intervals of Figure 2 bear the same relation to their inscribed ellipse
that the subsidiary intervals bear to the 95 per cent ellipse; that is, the
smaller ellipse is equivalent to the totality of all possible conventional
intervals and provides a lower limit for the joint confidence level of a
set of statements like (1.3).

Fig. 2. 95 per cent joint confidence ellipse and conventional
95 per cent confidence intervals for (3.2).

To determine the confidence level of the ellipse inscribed within
A'B'C'D'E'F', it is convenient to rewrite (1.2)

         $\hat{b}_p - t_{\alpha''}\sqrt{\text{Est. var. } \hat{b}_p} \le b_p \le \hat{b}_p + t_{\alpha''}\sqrt{\text{Est. var. } \hat{b}_p}.$

Then, on comparison with (4.7), it is apparent that

(5.1)    $t_{\alpha''} = \sqrt{k F_{\alpha'}(k, n-k-1)}$

or

         $F_{\alpha''}(1, n-k-1) = k F_{\alpha'}(k, n-k-1),$

where $1-\alpha''$ is the confidence level of the conventional interval
and $1-\alpha'$ is the level of the inscribed ellipse. The problem of
finding $\alpha'$ given $\alpha''$, or vice versa, is conveniently solved
by means of Pearson's Tables of the Incomplete Beta-Function [7]. In the
example under discussion, $k = 2$ and $t_{.05} = 2.145$ for fourteen
degrees of freedom; then (5.1) indicates $F_{\alpha'}(2, 14) = 2.3005$.
By means of the transformation described by Pearson [7, p. xlvii] or Mood
[5, p. 206], approximately .86 for $1-\alpha'$ is obtained from the
beta-function table. Thus the probability that (1.3) is correct is
bounded by .86 and .95. Table 3 presents values of $1-\alpha'$
corresponding to the conventional .95 and .99 levels of $1-\alpha''$ for
several other combinations of k and $n-k-1$. For large samples and
special situations where the variance is known, the limiting value of
$1-\alpha'$ for $F_{\alpha'}(k, \infty)$ is obtained from Pearson's
Tables of the Incomplete Gamma-Function [6].
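Where tables of the incomplete beta-function are not at hand, the lower bound $1-\alpha'$ can be computed directly when k is even, because the F distribution function then reduces to a finite sum. The sketch below relies on that standard identity and on (5.1); it is not the tabular method the text describes. The function name is illustrative, and the critical values 2.2281 and 2.5706 in the usage lines are two-sided 5 per cent points of Student's ratio for 10 and 5 degrees of freedom taken from standard tables, not from the paper.

```python
def joint_level_lower_bound(t_crit, k, df):
    """Level 1 - alpha' of the ellipse inscribed in the conventional intervals.

    Uses k * F_{alpha'}(k, df) = t_crit**2 from (5.1). The finite-sum form
    of the F distribution function used here is valid only for even k.
    """
    if k < 2 or k % 2 != 0:
        raise ValueError("this sketch handles even k only")
    x = t_crit ** 2 / (t_crit ** 2 + df)   # beta-variable transform of the F value
    b = df / 2.0
    term, total = 1.0, 1.0
    for j in range(1, k // 2):
        term *= (b + j - 1) / j * x        # builds C(b + j - 1, j) * x**j
        total += term
    return 1.0 - (1.0 - x) ** b * total

print(round(joint_level_lower_bound(2.145, 2, 14), 3))   # example in the text
print(round(joint_level_lower_bound(2.2281, 4, 10), 3))
print(round(joint_level_lower_bound(2.5706, 6, 5), 3))
```

For k = 2 the sum has a single term and the bound is simply $1 - (1 + t^2/\nu)^{-\nu/2}$ with $\nu = n-k-1$; odd k would need a general incomplete beta routine, which this sketch leaves out.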


6. INCLUSION OF $b_0$ IN THE CONFIDENCE ELLIPSE


In some applications it may be desirable to extend the confidence
ellipse so as to cover the constant term $b_0$. Define an arbitrary inde-
pendent variable $X_0 \equiv 1$. Then, $b_0$ is merely the partial
regression coefficient of Y on $X_0$. Let

         $\sum_\alpha X_{i\alpha} X_{j\alpha} = A_{ij}.$

In particular, $A_{00} = \sum_\alpha X_{0\alpha}^2 = n$ and
$A_{0j} = \sum_\alpha X_{0\alpha} X_{j\alpha} = A_{j0}$. Then, as
shown by Wilks [12, Sect. 8.3] and Mood [5, Sect. 13.5], the joint con-
fidence ellipsoid is defined by

         $F_\alpha(k+1, n-k-1) = \dfrac{(n-k-1)\sum_{i=0}^{k}\sum_{j=0}^{k} A_{ij}(\hat{b}_i - b_i)(\hat{b}_j - b_j)}{(k+1)\, n \hat{\sigma}^2}.$

This, like (3.1), may be transformed into a sphere by the method of
Section 4.




TABLE 3

LOWER BOUND OF PROBABILITY THAT ALL CONVENTIONAL
CONFIDENCE INTERVALS WILL BE CORRECT FOR k
REGRESSION COEFFICIENTS IN SAMPLES OF n

Conventional   Degrees of       Number of Independent Variables k
Confidence     Freedom
Level          n-k-1        2     3     4     5     6     8     10

   .95            5        .88   .79   .70   .62   .53   .38   .27
                  7        .87   .78   .67   .57   .47   .31   .19
                 10        .87   .76   .65   .53   .43   .26   .14
                 15        .86   .75   .62   .50   .39   .21   .11
                 20        .86   .74   .61   .48   .37   .19   .09
                 30        .86   .74   .60   .46   .34   .17   .07
                 60        .86   .73   .59   .45   .32   .15   .06
                  ∞        .85   .72   .57   .43   .30   .13   .05

   .99            5        .97   .95   .92   .89   .85   .77   .69
                  7        .97   .94   .91   .86   .81   .71   .59
                 10        .97   .94   .89   .84   .78   .64   .50
                 15        .97   .93   .88   .81   .74   .58   .42
                 20        .97   .93   .87   .80   .72   .54   .38
                 30        .97   .92   .86   .78   .70   .50   .33
                 60        .96   .92   .85   .77   .67   .47   .29
                  ∞        .96   .92   .84   .75   .64   .42   .24
7. CONCLUSION


In the simon pure application of the conventional approach, where a
single statement is specified, it is possible to establish an interval with
a definite probability $1-\alpha$. The same is true for certain special
problems involving multiple statements, such as Tukey's problem of con-
trasting means. But in general, we must conclude that establishing a
single probability for a set of multiple statements is either impracticable
or impossible. Instead, there will be two bounds, $1-\alpha'$ and
$1-\alpha''$, between which lies the probability of being right. How,
then, are these bounds to be chosen?

The common procedure to date, of establishing the upper bound
without regard to the lower, can hardly be justified. To take an ex-
treme example from Table 3, what does it mean if we establish inter-
vals for a 10-variable regression and all we can say is that the
probability [...]

[...] of establishing the lower bound without regard to the upper,
recommended here, will be criticized, no doubt, as unduly conservative;
whether this criticism is justified or not will depend on the [...]
in choosing them. In the bank stock example, where a multiplicity of
statements is indicated and the choice of statement must be deferred
until after the experiment in order to avoid missing important
findings, there is every reason to believe that the true probability
is much closer to its lower bound than to its upper. Here the loss
of efficiency in using the conservative approach cannot be very great.

There remains, of course, the perplexing problem of establishing
bounds when a limited number of statements are contemplated but no
convenient computation procedure is available for ascertaining
$1-\alpha$. [...] a possible solution is to select bounds that straddle
the desired probability. For small values of k, say less than four, the
bounds are not too far apart, and one might find examples for which
$1-\alpha' = .90$ and $1-\alpha'' = .99$ approximately. Perhaps this
provides a working approximation for the .95 level on the hope that
$1-\alpha$ lies about midway between the bounds.

ACKNOWLEDGMENTS 
I wish to thank J. Arthur Greenwood and W. Braddock Hickman
for a great deal of inspiration and advice. However, I must absolve
both of these gentlemen from any responsibility, particularly since
each of them has reservations.

REFERENCES

[1] Bôcher, Maxime, Introduction to Higher Algebra, New York, The Macmillan
Company, 1936.
[2] Croxton, F. E., and Cowden, D. J., Applied General Statistics, New York,
Prentice-Hall, 1939.
[3] Dwyer, P. S., Linear Computations, New York, John Wiley and Sons, 1951.
[4] Haavelmo, Trygve, “Methods of Measuring the Marginal Propensity to
Consume,” Journal of the American Statistical Association, 42 (1947), 105-
[5] Mood, Alexander M., Introduction to the Theory of Statistics, New York,
McGraw-Hill Book Company, 1950. See especially Chapter 13.
[6] Pearson, Karl, editor, Tables of the Incomplete Gamma-Function, London,
The Biometrika Office, reissue, 1946.
[7] Pearson, Karl, editor, Tables of the Incomplete Beta-Function, London, The
Biometrika Office, 1934.
[8] Roy, S. N., and Bose, R. C., “Simultaneous Confidence Interval Estimation,”
Annals of Mathematical Statistics, 24 (1953), 513-36.

146 AMERICAN STATISTICAL ASSOCIATION JOURNAL, MARCH 1954


[9] Scheffé, Henry, “A Method for Judging All Contrasts in the Analysis of
Variance,” Biometrika, 40 (1953), 87-104.
[10] Tintner, Gerhard, “A Test for Linear Relations between Weighted Regression
Coefficients,” Journal of the Royal Statistical Society, Series B, 12
(1950), 273-77.
[11] Tukey, John W., “Allowances for Various Types of Error Rates,” unpublished
invited address presented before a joint meeting of the Institute of
Mathematical Statistics and the Biometrics Society (ENAR) at Blacksburg,
Va., March 19, 1952. For discussion of Tukey's method see [8] and [9].
[12] Wilks, S. S., Mathematical Statistics, Princeton University Press, Princeton,
N. J., 1946. See especially Chapter 8.


ASYMPTOTIC RELATIVE EFFICIENCIES OF DISTRIBUTION-FREE
TESTS OF RANDOMNESS AGAINST NORMAL ALTERNATIVES

ALAN STUART
London School of Economics

1. THE MEASURE OF EFFICIENCY

Several writers, notably Hotelling and Pabst [5], have explicitly
assumed that the relative efficiency of two test statistics is to be
measured by their estimating efficiencies. While this seems reasonable,
it is by no means obvious, since if the two tests are consistent, the ratio
of their powers against any fixed alternative hypothesis must tend to
unity with increasing sample size $n$, and it may easily be shown that
for any $n$, the less efficient estimator may provide a more powerful test
(Sundrum [14]).

Pitman [11] has proposed a measure of the asymptotic relative efficiency
of consistent tests. Given that the two statistics, $t_1$ and $t_2$, have
normal limit distributions with variances of order $n^{-1}$, and that certain
general regularity conditions are satisfied, he considered a limiting
process in which the alternative hypothesis $H_1$ differs from the null
hypothesis $H_0$ by a quantity of order $n^{-1/2}$, so that as $n$ increases, $H_1$
tends to $H_0$. Under these conditions, he showed that the reciprocal of
the ratio of sample sizes required to attain equal power against the
same alternative was, in the limit,

$$\lim_{n\to\infty}\ \frac{\left[\partial E(t_1)/\partial\theta\right]^2_{\theta=\theta_0}\big/V(t_1\mid\theta_0)}{\left[\partial E(t_2)/\partial\theta\right]^2_{\theta=\theta_0}\big/V(t_2\mid\theta_0)}, \tag{1}$$

where $\theta$ is the parameter whose value distinguishes $H_0$ from $H_1$, and
$E$ and $V$ denote mean value and variance respectively.
Some such limiting process is necessary if we require a single measure
of the relative efficiency of two tests, but it is not altogether surprising
that Pitman's result is equivalent to the use of estimating efficiency as
a criterion. With $t_1$ and $t_2$ as our (consistent) test statistics, let

$$T_1 = f(t_1), \qquad T_2 = g(t_2)$$

be transformations of them which are consistent estimators of the
underlying parameter $\theta$. For large samples,




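$$V(T_1) = V(t_1)\left(\frac{\partial f}{\partial t_1}\right)^2.$$

If $T_1 = \theta\{1 + O(n^{-\delta})\}$ and $t_1 = E(t_1)\{1 + O(n^{-\epsilon})\}$, where $\delta$, $\epsilon$ are positive
constants, we may write,¹ to our order of approximation,

$$\frac{\partial f}{\partial t_1} = \frac{\partial f}{\partial\theta}\Big/\frac{\partial t_1}{\partial\theta} = 1\Big/\frac{\partial E(t_1)}{\partial\theta}.$$

Thus

$$V(T_1) = V(t_1)\Big/\left[\frac{\partial E(t_1)}{\partial\theta}\right]^2.$$

Similarly

$$V(T_2) = V(t_2)\Big/\left[\frac{\partial E(t_2)}{\partial\theta}\right]^2.$$

Thus

$$\frac{V(T_2)}{V(T_1)} = \frac{(E_1')^2\,V_2}{V_1\,(E_2')^2} \tag{2}$$

in a shortened notation.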

At $\theta_0$, the right side of (2) is equivalent to Pitman's measure (1), and
since the left side of (2) is the ordinary estimating efficiency of the
transformed statistics on $H_0$, it follows that Pitman's measure simply
reproduces the estimating properties of the appropriate transformations
of the test-statistics being considered. As a simple example, the
sign test for the median reproduces the efficiency property of the
sample median as an estimator of the population mean. (The result is
$2/\pi$ for normal populations.)

Thus Pitman's result on power may be regarded as a justification of
the procedure of using estimating efficiency as a test criterion.
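
The sign-test example can be checked numerically. The following is a minimal sketch, not part of the original derivation: it evaluates the quantity $(E')^2/V$ at the null value for the sample mean and for the sign statistic (the number of observations exceeding zero) under a unit-variance normal population, and takes their ratio. The function names and the sample size `n = 100` are illustrative assumptions.

```python
import math


def normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def efficacy(mean_fn, var_fn, theta0=0.0, h=1e-5):
    """(E')^2 / V at theta0, with E' taken by a central difference."""
    slope = (mean_fn(theta0 + h) - mean_fn(theta0 - h)) / (2.0 * h)
    return slope ** 2 / var_fn(theta0)


n = 100  # illustrative sample size; the ratio below does not depend on it

# Sample mean of n unit-variance normal observations with mean theta.
mean_eff = efficacy(lambda th: th, lambda th: 1.0 / n)

# Sign statistic: number of observations exceeding zero (binomial).
sign_eff = efficacy(
    lambda th: n * normal_cdf(th),
    lambda th: n * normal_cdf(th) * (1.0 - normal_cdf(th)),
)

are_sign_vs_mean = sign_eff / mean_eff
print(are_sign_vs_mean, 2.0 / math.pi)
```

Under these assumptions the ratio agrees with the value $2/\pi$ quoted above.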


2. TESTS OF RANDOMNESS AGAINST NORMAL REGRESSION ALTERNATIVES

Using Pitman's measure, we may investigate tests of randomness for
the standardised normal regression model

$$y_i = \alpha + \beta X_i + \epsilon_i,$$


¹ Mr. J. Durbin has considerably simplified this derivation.




where $\epsilon$ is a normal vector with $E(\epsilon) = 0$ and $V(\epsilon) = I$. If the values of
$X$ are spaced at equal intervals, no generality is lost in replacing them
by the numbers 1 to $n$.

[...] which is exactly normally distributed with mean $\beta$ and variance
$1/[\sum (X_i - \bar X)^2]$. Since we have replaced the $X_i$ by the natural numbers,
$\sum (X_i - \bar X)^2 = n(n^2-1)/12$ and, for the statistic $b$, we have

$$\frac{(E')^2}{V} = \frac{n(n^2-1)}{12} \sim \frac{n^3}{12}. \tag{3}$$
3. PRELIMINARY RESULTS

For application in succeeding sections, we require a few preliminary
results for the normal regression model.
Define

$$H_{ij} = \begin{cases} 1 & \text{if } y_i \ge y_j, \\ 0 & \text{otherwise.} \end{cases}$$

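(a) Now $E(H_{ij}) = \mathrm{Prob}\{H_{ij} = 1\}$, and since $(y_i - y_j)$ is a normal variate
with mean $\beta(i-j)$ and variance 2, this is

$$E(H_{ij}) = \int_0^\infty N[\beta(i-j),\ 2] = \int_{-\beta(i-j)}^{\infty} N[0,\ 2]$$

in an obvious notation. Thus

$$\left[\frac{\partial E(H_{ij})}{\partial\beta}\right]_{\beta=0} = \frac{i-j}{2\sqrt{\pi}}. \tag{4}$$

(b) Also, for distinct $i$, $j$, $k$, $l$, $E(H_{ij}H_{kl}) = E(H_{ij})E(H_{kl})$, so that

$$\frac{\partial E(H_{ij}H_{kl})}{\partial\beta} = E(H_{kl})\,\frac{\partial E(H_{ij})}{\partial\beta} + E(H_{ij})\,\frac{\partial E(H_{kl})}{\partial\beta}$$

and, from (4),

$$\left[\frac{\partial E(H_{ij}H_{kl})}{\partial\beta}\right]_{\beta=0} = \frac{i-j+k-l}{4\sqrt{\pi}}. \tag{5}$$

(c) Since, by the rule for differentiating a function of functions,

$$\frac{d}{dx}\,\phi(ax,\ bx) = a\,\frac{\partial\phi}{\partial(ax)} + b\,\frac{\partial\phi}{\partial(bx)},$$

we have

$$\frac{\partial}{\partial x}\int_{-\infty}^{ax}\!\int_{-\infty}^{bx} f(u, v)\,du\,dv = a\int_{-\infty}^{bx} f(ax, v)\,dv + b\int_{-\infty}^{ax} f(u, bx)\,du, \tag{6}$$

and similarly for variables at the lower limits of integration.
Now $(y_i - y_j)$ and $(y_j - y_k)$ are jointly normally distributed with
correlation $-\tfrac12$. Thus

$$E(H_{ij}H_{jk}) = \mathrm{Prob}(H_{ij} = 1,\ H_{jk} = 1) = \int_{-\beta(i-j)}^{\infty}\!\int_{-\beta(j-k)}^{\infty} N\!\left[\begin{pmatrix}0\\0\end{pmatrix},\ \begin{pmatrix}2 & -1\\ -1 & 2\end{pmatrix}\right]$$

in an obvious notation, so that, applying (6), appropriately modified,

$$\left[\frac{\partial E(H_{ij}H_{jk})}{\partial\beta}\right]_{\beta=0} = \frac{i-k}{4\sqrt{\pi}}. \tag{7}$$

Similarly, we see that

$$\left[\frac{\partial E(H_{ij}H_{ik})}{\partial\beta}\right]_{\beta=0} = \frac{2i-j-k}{4\sqrt{\pi}}. \tag{8}$$

Hence, summarising results (4)-(8) and remembering that $H_{ii} = 1$
identically, we have

$$\left[\frac{\partial}{\partial\beta}E(H_{ij}H_{kl})\right]_{\beta=0} = \frac{i-j+k-l}{4\sqrt{\pi}}, \qquad j \ne i \text{ and } l \ne k,$$

$$\left[\frac{\partial}{\partial\beta}E(H_{ii}H_{kl})\right]_{\beta=0} = \frac{k-l}{2\sqrt{\pi}}. \tag{9}$$

(d) Finally, we consider the probability that the product
$(y_i - y_{i-1})(y_{i+1} - y_i)$ is negative. This requires that one of the brackets is
positive and the other negative, and this has probability

$$E = 2\int_{-\infty}^{-\beta/\sqrt2}\!\int_{-\beta/\sqrt2}^{\infty} N\!\left[\begin{pmatrix}0\\0\end{pmatrix},\ \begin{pmatrix}1 & -\tfrac12\\ -\tfrac12 & 1\end{pmatrix}\right]$$

and, applying (6), suitably modified, we find that the terms cancel and

$$\left[\frac{\partial E}{\partial\beta}\right]_{\beta=0} = 0. \tag{10}$$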

We now proceed to calculate $(E')^2/V$ for five well-known distribution-free
tests of randomness, where the null hypothesis is that all the
observations have come from the same continuous population. In the
normal regression model, this is equivalent to testing $H_0\colon \beta = 0$.
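
The slope in result (4) can be verified numerically from the stated distribution of $(y_i - y_j)$, namely normal with mean $\beta(i-j)$ and variance 2. The sketch below is only a check under that assumption; the function names and the step size `h` are illustrative choices, and the closed form $(i-j)/(2\sqrt{\pi})$ is the value the central difference is compared against.

```python
import math


def expected_h(i, j, beta):
    """E(H_ij) = Prob(y_i - y_j > 0) when y_i - y_j ~ N(beta*(i-j), 2)."""
    z = beta * (i - j) / math.sqrt(2.0)  # standardised threshold
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def slope_at_null(i, j, h=1e-6):
    """Central-difference estimate of dE(H_ij)/d(beta) at beta = 0."""
    return (expected_h(i, j, h) - expected_h(i, j, -h)) / (2.0 * h)


def closed_form_slope(i, j):
    """The value (i - j) / (2 sqrt(pi)) of result (4)."""
    return (i - j) / (2.0 * math.sqrt(math.pi))


for i, j in [(1, 2), (2, 1), (1, 5), (4, 2)]:
    print(i, j, slope_at_null(i, j), closed_form_slope(i, j))
```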


* 


4. THE DIFFERENCE-SIGN TEST

The test criterion, proposed by Moore and Wallis [8], is simply the
number of positive first differences in the series. Stuart [12] has made
explicit power calculations against normal regression alternatives. The
test statistic is

$$D = \sum_{i=1}^{n-1} H_{i+1,\,i}.$$

By (4),

$$E'(D) = \frac{n-1}{2\sqrt{\pi}}.$$

Also, as is easily shown (Stuart [12]),

$$V(D) = \tfrac{1}{12}(n+1).$$

So for $D$,

$$\frac{(E')^2}{V} = \frac{3(n-1)^2}{\pi(n+1)} \sim \frac{3n}{\pi}. \tag{11}$$


5. KENDALL'S RANK CORRELATION TEST

Mann [7] suggested the use of Kendall's rank correlation coefficient,
$t$, for testing randomness. We shall consider the quantity $Q$ related to $t$
by

$$t = 1 - \frac{4Q}{n(n-1)}.$$

$Q$ may be defined by

$$Q = \sum_{i<j} H_{ij},$$

i.e. we take all possible $\tfrac12 n(n-1)$ comparisons between pairs of
observations, and score unity whenever an observation exceeds a later
observation.

From (4),

$$E'(Q) = \sum_{i<j}\frac{i-j}{2\sqrt{\pi}} = -\frac{n(n^2-1)}{12\sqrt{\pi}}.$$

Also (see, e.g., Kendall [6])

$$V(Q) \sim n^3/36$$

so that for $Q$,

$$\frac{(E')^2}{V} \sim \frac{n^3}{4\pi}. \tag{12}$$


6. SPEARMAN'S RANK CORRELATION TEST

Spearman's rank correlation coefficient, $r_s$, can clearly be used
wherever $t$ can, and has been considered by Daniels [2] as a test against
trend. We shall consider the quantity $V$ related to $r_s$ by

$$r_s = 1 - \frac{12V}{n(n^2-1)}.$$

It may easily be shown (see Durbin and Stuart [3]) that $V$ may be
defined analogously to $Q$ by

$$V = \sum_{i<j}(j-i)H_{ij},$$

i.e. $V$ is a weighted sum of the $H$-scores, the weights being the distance
separating the observations contributing the score. Now

$$E'(V) = \sum_{i<j}(j-i)\left[\frac{\partial E(H_{ij})}{\partial\beta}\right]_{\beta=0}$$

and, by (4),

$$E'(V) = -\sum_{i<j}\frac{(j-i)^2}{2\sqrt{\pi}} = -\frac{n^2(n^2-1)}{24\sqrt{\pi}}.$$

Also (see Kendall [6]) it is easily shown that the variance of $V$ is
asymptotically $n^5/144$.
Thus

$$\frac{(E')^2}{V} \sim \frac{n^3}{4\pi}, \tag{13}$$

just as for Kendall's test.
7. THE TURNING POINT TEST

Another test, proposed by Wallis and Moore [16], consists in counting
the number of runs up and down in the series or, equivalently, the number
of peaks and troughs in the series. This statistic was considered
earlier from another point of view by Bilham [1]. We score

$$T_i = \begin{cases} 1 & \text{if } (y_i - y_{i-1})(y_{i+1} - y_i) < 0, \\ 0 & \text{otherwise,} \end{cases}$$

and the sum of the $(n-2)$ $T_i$ is our statistic $T$. Now we have seen in
equation (10) that

$$\left[\frac{\partial E(T_i)}{\partial\beta}\right]_{\beta=0} = 0.$$

Thus, for $T$ also,

$$E' = 0 \tag{14}$$

and the relative efficiency of the test is zero.
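
The four statistics considered so far are all simple functions of the $H$-scores, and a short sketch may make the definitions concrete. It assumes a series without ties and uses zero-based positions; the function names, the example series, and the two relations $t = 1 - 4Q/\{n(n-1)\}$ and $r_s = 1 - 12V/\{n(n^2-1)\}$ used to recover the coefficients are stated here as assumptions of the sketch rather than as a transcription of the paper's formulas.

```python
def h_score(y, i, j):
    """Score unity when observation i exceeds observation j."""
    return 1 if y[i] > y[j] else 0


def difference_sign(y):
    """D: number of positive first differences."""
    return sum(h_score(y, i + 1, i) for i in range(len(y) - 1))


def kendall_q(y):
    """Q: number of pairs in which an observation exceeds a later one."""
    n = len(y)
    return sum(h_score(y, i, j) for i in range(n) for j in range(i + 1, n))


def spearman_v(y):
    """V: the same scores weighted by the distance j - i."""
    n = len(y)
    return sum((j - i) * h_score(y, i, j)
               for i in range(n) for j in range(i + 1, n))


def turning_points(y):
    """T: number of peaks and troughs in the series."""
    return sum(1 for i in range(1, len(y) - 1)
               if (y[i] - y[i - 1]) * (y[i + 1] - y[i]) < 0)


def kendall_t(y):
    n = len(y)
    return 1.0 - 4.0 * kendall_q(y) / (n * (n - 1))


def spearman_rs(y):
    n = len(y)
    return 1.0 - 12.0 * spearman_v(y) / (n * (n * n - 1))


def spearman_rs_from_ranks(y):
    """Usual rank-difference formula, for comparison (no ties assumed)."""
    n = len(y)
    ranks = [1 + sum(1 for v in y if v < u) for u in y]
    sum_d2 = sum((r - (pos + 1)) ** 2 for pos, r in enumerate(ranks))
    return 1.0 - 6.0 * sum_d2 / (n ** 3 - n)


series = [3.0, 1.0, 2.0, 5.0, 4.0]  # hypothetical example series
print(difference_sign(series), kendall_q(series),
      spearman_v(series), turning_points(series))
print(kendall_t(series), spearman_rs(series), spearman_rs_from_ranks(series))
```

The double loops make the comparison count explicit: $D$ and $T$ need of order $n$ comparisons, $Q$ and $V$ of order $n^2$.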


8. THE RANK SERIAL CORRELATION TEST

Since $H_{ii} = 1$, the rank of $y_i$ among the $n$ $y$'s is

$$r_i = \sum_{j=1}^{n} H_{ij}.$$

Consider

$$r_i r_k = \sum_{j=1}^{n} H_{ij} \sum_{l=1}^{n} H_{kl} = \sum_{j=1}^{n} H_{ij}H_{kj} + \sum_{j \ne l} H_{ij}H_{kl}.$$

From (9) we obtain immediately

$$\left[\frac{\partial}{\partial\beta}E(r_i r_k)\right]_{\beta=0} = \frac{n(n+1)}{4\sqrt{\pi}}\,(i + k - n - 1). \tag{15}$$

Now the rank serial correlation coefficient of lag $s$ is, neglecting
constants,

$$W = \sum_{i=1}^{n-s} r_i r_{i+s}, \tag{16}$$

when a non-circular definition is used, or

$$W = \sum_{i=1}^{n-s} r_i r_{i+s} + \sum_{i=1}^{s} r_i r_{n-s+i}, \tag{17}$$

when a circular definition is used. These statistics are special cases,
using ranks, of the serial correlation coefficient proposed as a
distribution-free test against trend by Wald and Wolfowitz [15].

If in (15), we put $k = i + s$, the non-constant factor becomes
$(2i - n + s - 1)$, and since

$$\sum_{i=1}^{n-s}(2i - n + s - 1) = 0,$$

we see from (16) that

$$\left[\frac{\partial}{\partial\beta}E(W)\right]_{\beta=0} = 0. \tag{18}$$

Similarly, if we put $k = n - s + i$ in (15), the non-constant factor
becomes $(2i - s - 1)$, and since

$$\sum_{i=1}^{s}(2i - s - 1) = 0,$$

we see from (17) and (18) that

$$\left[\frac{\partial}{\partial\beta}E(W)\right]_{\beta=0} = 0. \tag{19}$$

Thus, for any lag, the circular or non-circular rank serial correlation
coefficient has relative efficiency zero as a test against normal
regression.


9. COMPARISONS AND CONCLUSIONS

Collecting our results (3), (11), (12), (13), (14), (18) and (19), we
obtain, using (1), the following table for the asymptotic relative
efficiencies of the tests:

                                        Asymptotic Relative Efficiency
Test                                    Compared to b    Compared to D

Regression coefficient          (b)     1                ∞
Spearman's test                 (V)     3/π = .95        ∞
Kendall's test                  (Q)     3/π = .95        ∞
Difference-sign test            (D)     0                1
Turning point test              (T)     0                0
Rank serial correlation test    (W)     0                0


The rank correlation tests are highly and equally efficient, agreeing
with the results of Daniels [2], who considers a more general model.
Our value $3/\pi$ is not identical with the value $9/\pi^2$ established for their
efficiency as tests of bivariate independence by Hotelling and Pabst
[5] and by Moran [9]: here we have been dealing with an essentially
univariate problem, and since the null hypothesis is one of independence,
it is not surprising that we get for our efficiency the square root
of the bivariate efficiency. (An explanation of this result is given by
Stuart [13].) When we take into consideration the computing time for
each statistic, Spearman's test is to be preferred, the formula

$$r_s = 1 - \frac{6\sum d^2}{n^3 - n}$$

giving it a clear advantage over Kendall's test, especially for large $n$.

As will be seen from the table, all three other tests have asymptotic
relative efficiencies of zero, although, as the last column shows, $D$ is
to be preferred to the other two, which have zero relative efficiency for
any value of $n$. (The result for $W$ has previously been obtained by
Noether [10].) Care must be taken in interpreting these results, for,
since the tests are all consistent against this alternative hypothesis, we
can always make the power of any of them as close to 1 as we please by
increasing sample size indefinitely. The situation is analogous to (and,
as shown in section 1 above, a reflection of) that arising with a
consistent, but highly inefficient, estimator. The relative efficiencies which
we have calculated are local properties, in the sense that they are
restricted to neighbourhoods of the null hypothesis which are small
enough, when sample size is taken into account, to keep the powers of
the tests bounded away from unity. Reference should be made here to
a sampling experiment reported by Foster and Stuart [4] which closely
bears out the results of the table above.

While $D$ is very much simpler to compute than any of the other
statistics, this fact does not close the gap between it and $V$. For (11) and
(13) show that $V$ has an efficiency advantage of order $n^2$, while if
computing time is proportional to the number of comparisons to be
made between observations, the advantage to $D$ with $(n-1)$ comparisons,
as against $\tfrac12 n(n-1)$ for $V$, is only of order $n$.
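
The entries of the table can be reproduced, for a finite $n$, by forming the ratios of the quantities $(E')^2/V$. The sketch below is an illustration under stated assumptions: the value for $b$ is taken as $n(n^2-1)/12$, that for $D$ as $3(n-1)^2/\{\pi(n+1)\}$, and that for $Q$ is built from the slope $n(n^2-1)/(12\sqrt{\pi})$ and the exact null variance $n(n-1)(2n+5)/72$ of the inversion count, which is asymptotically $n^3/36$. The function names are hypothetical.

```python
import math


def efficacy_b(n):
    """(E')^2/V for the regression coefficient b."""
    return n * (n * n - 1) / 12.0


def efficacy_q(n):
    """(E')^2/V for Kendall's Q, using the exact null variance of Q."""
    slope = n * (n * n - 1) / (12.0 * math.sqrt(math.pi))
    variance = n * (n - 1) * (2 * n + 5) / 72.0
    return slope ** 2 / variance


def efficacy_d(n):
    """(E')^2/V for the difference-sign statistic D."""
    slope = (n - 1) / (2.0 * math.sqrt(math.pi))
    variance = (n + 1) / 12.0
    return slope ** 2 / variance


for n in (10, 100, 1000):
    print(n,
          efficacy_q(n) / efficacy_b(n),   # tends to 3/pi
          efficacy_d(n) / efficacy_b(n),   # tends to 0
          efficacy_q(n) / efficacy_d(n))   # grows like n^2 / 12
```

The last column shows the order-$n^2$ advantage of the rank correlation tests over $D$ referred to above.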


10. SUMMARY

It is shown that against normal regression alternatives, the two rank
correlation tests are to be preferred to three other distribution-free
tests, these being the difference-sign test, the rank serial correlation
coefficient test and the turning point test. Further, on computational
grounds alone, Spearman's rank correlation test is to be preferred to
Kendall's test. These results do not, of course, apply to other alternative
hypotheses, such as the presence of serial correlation.


ACKNOWLEDGMENT

I should like to express my thanks to Dr. H. R. van der Vaart of
Leiden University for a close critical reading of the first draft of this
paper, which led to the removal of a cluster of errors.


REFERENCES

[1] Bilham, E. G., “Correlation coefficients,” Quarterly Journal of the Royal
Meteorological Society, 52 (1926), 172.
[2] Daniels, H. E., “Rank correlation and population models,” Journal of the
Royal Statistical Society, Series B (Methodological), 12 (1950), 171-81.
[3] Durbin, J., and Stuart, A., “Inversions and rank correlation coefficients,”
Journal of the Royal Statistical Society, Series B (Methodological), 13 (1951),
303-9.
[4] Foster, F. G., and Stuart, A., “Distribution-free tests in time-series based
on the breaking of records,” Journal of the Royal Statistical Society, Series
B (Methodological), 16 (1954).
[5] Hotelling, Harold, and Pabst, Margaret Richards, “Rank correlation and
tests of significance involving no assumption of normality,” Annals of
Mathematical Statistics, 7 (1936), 29-43.
[6] Kendall, M. G., Rank Correlation Methods, London, Griffin, 1948.
[7] Mann, Henry B., “Non-parametric tests against trend,” Econometrica, 13
(1945), 245-59.
[8] Moore, Geoffrey H., and Wallis, W. Allen, “Time series significance tests
based on signs of differences,” Journal of the American Statistical
Association, 38 (1943), 153-64.
[9] Moran, P. A. P., “Partial and multiple rank correlation,” Biometrika, 38
(1951), 26-32.
[10] Noether, Gottfried E., “Asymptotic properties of the Wald-Wolfowitz test
of randomness,” Annals of Mathematical Statistics, 21 (1950), 231-46.
[11] Pitman, E. J. G., “Lecture notes on non-parametric inference,” (Univ. of
N. Carolina, mimeographed, 1948).
[12] Stuart, Alan, “The power of two difference-sign tests,” Journal of the
American Statistical Association, 47 (1952), 416-24.
[13] Stuart, Alan, “The correlation between variate-values and ranks in samples
from a continuous distribution,” British Journal of Statistical Psychology,
(to be published).
[14] Sundrum, R. M., “Theory and application of distribution-free methods,”
(Unpublished Ph.D. thesis, University of London, 1953).
[15] Wald, A., and Wolfowitz, J., “An exact test for randomness in the
non-parametric case based on serial correlation,” Annals of Mathematical
Statistics, 14 (1943), 378-88.
[16] Wallis, W. Allen, and Moore, Geoffrey H., “A significance test for
time-series analysis,” Journal of the American Statistical Association, 36 (1941),
401-9.


ESTIMATION OF THE POISSON PARAMETER FROM
TRUNCATED SAMPLES AND FROM CENSORED
SAMPLES*

A. C. COHEN, JR.
University of Georgia

Maximum likelihood estimators of the Poisson parameter
applicable to both truncated and censored samples are derived
in this paper. Singly and doubly truncated samples as
well as singly and doubly censored samples are considered.
The estimators obtained are presented in simple algebraic
forms and their application to practical problems with the
aid of standard Poisson tables is illustrated with numerical
examples. Asymptotic variances of estimates for the different
cases considered are obtained from second derivatives of the
likelihood functions and are simplified to forms which permit
ready evaluation.


1. INTRODUCTION

The Poisson distribution is an appropriate mathematical model for
studying such diverse classes of discrete data as haemocytometer
counts of blood cells per square, the number of noxious weed seed per
unit of field seed, and the number of defects per unit of a manufactured
product. It is thus of interest to the biologist, the agronomist,
and the quality control engineer as well as to research workers in
various other fields of scientific endeavor. When sample observation is
permitted over the full range of the complete distribution, the estimation
problem is quite simple. In that case, the maximum likelihood estimate
of the population parameter is the sample mean. When the sample is
truncated or otherwise restricted, as for example when the number of
zero observations is unknown or when observations of higher counts are
pooled, the estimation problem increases in complexity. Various aspects
of estimation involving singly truncated and singly censored Poisson
samples with known terminals have been considered by Tippett [7],
Bliss [1], Rider [5], David and Johnson [2], and by Moore [4].

According to terminology which has recently come into popular usage,
truncated samples are understood to be those from which the number of
observations eliminated by the restricting process is unknown. Censored
samples are those in which the total number of sample specimens
is known, but measurements on some of this number are lacking.
Censored samples may thus be regarded as truncated samples having a
known number of unmeasured observations. In this paper
and in the references cited, interest lies only with samples truncated
or censored such that all observations above or below specified terminals
are either eliminated or unmeasured; that is, with samples in which
the restriction applies only to the tails of the sample. The classification
of single or double indicates whether one or both tails have been
restricted.

Tippett obtained the maximum likelihood estimator for a sample
that is singly censored on the right, but left his results in a somewhat
unwieldy form for practical application when more than four of
the individual frequency classes are available. For four or less frequency
classes, he provided nomograms to aid in computing the required
estimates. Bliss developed an approximation to Tippett's estimator and
provided two tables necessary for applying his procedure. Moore was
also concerned with this case and developed an estimator based on
sample moment functions. Rider developed an estimator based on
moment functions for the case of samples singly truncated on the left
and also considered maximum likelihood estimation for the same case.
David and Johnson were likewise concerned with maximum likelihood
estimation in this latter case when the zero frequency class is the only
one missing. The present paper is more general and of greater extent
than the references cited above. It is concerned with maximum likelihood
estimation from singly and doubly truncated samples as well as
from singly and doubly censored samples, all with known terminals.
Estimators derived here are expressed in simple algebraic forms for
easy application to practical problems.

* Preliminary report presented before American Mathematical Society, Auburn, Alabama,
November 23, 1951. Abstract published in Bulletin of the American Mathematical Society, 58 (1952), 60.
Sponsored in part by the Office of Ordnance Research, U.S. Army, under contract
DA-01-009-ORD-288.


2. THE POISSON DISTRIBUTION

The complete Poisson distribution function may be expressed as

$$f(x, m) = \frac{e^{-m}m^x}{x!}, \qquad x = 0, 1, 2, \cdots; \tag{1}$$

$f(x, m)$ is thus the probability of observing exactly $x$ occurrences
of the event studied. The cumulative probability that $c$ or more
occurrences will be observed may be written as

$$P(c, m) = \sum_{x=c}^{\infty} f(x, m). \tag{2}$$




As stated in the introduction, the maximum likelihood estimate of $m$
based on a complete (unrestricted) random sample is given as

$$\hat m = \bar x = \sum_{i=1}^{n} x_i / n, \tag{3}$$

where $n$ is the total number of sample observations. In what follows,
corresponding estimators are derived for various types of truncated
and censored samples. Where no confusion will likely result, the notation
is simplified by writing $f(x)$ and $P(c)$ in place of the longer $f(x, m)$
and $P(c, m)$.
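
Where Poisson tables are not at hand, the two functions $f(x, m)$ and $P(c, m)$ of equations (1) and (2) are easily computed directly. The following is a minimal sketch; the function names are illustrative, and the upper tail is obtained as one minus the finite lower sum rather than by summing the infinite series.

```python
import math


def poisson_f(x, m):
    """f(x, m): probability of exactly x occurrences; zero for x < 0."""
    if x < 0:
        return 0.0
    return math.exp(-m) * m ** x / math.factorial(x)


def poisson_p(c, m):
    """P(c, m): probability of c or more occurrences."""
    return 1.0 - sum(poisson_f(x, m) for x in range(c))


# Values of the kind read from tables in the numerical illustrations.
print(round(poisson_f(8, 3.8), 6), round(poisson_p(9, 3.8), 6))
```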
3. TRUNCATED SAMPLES: NUMBER UNMEASURED (MISSING)
OBSERVATIONS UNKNOWN

Doubly truncated. The probability function of a Poisson distribution
truncated on the left at $x = c$ and on the right at $x = d$ may be written
as

$$f(x) = \begin{cases} 0, & x < c, \\ [P(c) - P(d+1)]^{-1}\,e^{-m}m^x/x!, & c \le x \le d, \\ 0, & x > d. \end{cases} \tag{4}$$

The truncated distribution (4) is thus normalized so that

$$\sum_{x=0}^{\infty} f(x) = \sum_{x=c}^{d} f(x) = 1. \tag{5}$$

The likelihood function of a random sample of $n$ observations from a
population distributed according to (4) may be written as

$$P(x_1, x_2, \cdots, x_n) = [P(c) - P(d+1)]^{-n}\,e^{-nm}\,m^{\sum x_i}\Big/\prod_{i=1}^{n} x_i!. \tag{6}$$

We obtain this same likelihood function when we consider the population
as being complete with frequency function (1) and consider the
sample as being truncated with the restriction that sample observation
can include neither count nor measurement beyond the truncation
points. In general, the writer prefers to view the truncation or other
restrictions as being imposed on the sample rather than on the population.


Taking logarithms of (6) and writing $n\bar x$ in place of $\sum x_i$, where $\bar x$
is thus the mean of the truncated sample, we have

$$L = -n\ln[P(c) - P(d+1)] - nm + n\bar x\ln m - \ln\Big[\prod x_i!\Big]. \tag{7}$$

Differentiating (7) and equating the result to zero, we obtain the
estimating equation

$$\frac{1}{n}\frac{dL}{dm} = \frac{\bar x}{m} - 1 - \frac{f(c-1) - f(d)}{P(c) - P(d+1)} = 0. \tag{8}$$

For a clearer understanding of how (8) was obtained from (7), the
following details are included on determining $dP(c)/dm$. From (2), we
have

$$\frac{dP(c)}{dm} = \frac{d}{dm}\left[\sum_{x=c}^{\infty}\frac{e^{-m}m^x}{x!}\right] = \sum_{x=c}^{\infty}\frac{d}{dm}\left[\frac{e^{-m}m^x}{x!}\right] = \sum_{x=c}^{\infty}\left[\frac{x\,e^{-m}m^{x-1}}{x!} - \frac{e^{-m}m^x}{x!}\right]$$

$$= \sum_{x=c}^{\infty}\frac{e^{-m}m^{x-1}}{(x-1)!} - \sum_{x=c}^{\infty}\frac{e^{-m}m^x}{x!} = P(c-1) - P(c),$$

and finally

$$\frac{dP(c)}{dm} = f(c-1). \tag{9}$$

With the aid of a set of Poisson tables such as those of Molina [3],
equation (8) can be solved for the required estimate, $\hat m$, by elementary
iterative procedures, one of which is illustrated in Section 7.
Singly truncated on the left. For a sample that is singly truncated on
the left, the estimating equation (8) becomes

$$\frac{1}{n}\frac{dL}{dm} = \frac{\bar x}{m} - 1 - \frac{f(c-1)}{P(c)} = 0, \tag{10}$$

since here, $d \to \infty$, $\lim_{d\to\infty} f(d) = 0$, and $\lim_{d\to\infty} P(d+1) = 0$. With $c = 1$,
estimating equation (10) is applicable in the special case of an unknown
number of zero observations, the case considered in [2].

Singly truncated on the right. In this case, $c = 0$, $f(c-1) = 0$, and
$P(c) = 1$. Accordingly the estimating equation becomes

$$\frac{1}{n}\frac{dL}{dm} = \frac{\bar x}{m} - 1 + \frac{f(d)}{1 - P(d+1)} = 0. \tag{11}$$
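
The identity $dP(c)/dm = f(c-1)$, on which the estimating equations of this section rest, can be checked by numerical differentiation. The sketch below is only such a check; the function names, the value $m = 3.8$, and the step size `h` are illustrative assumptions.

```python
import math


def poisson_f(x, m):
    """f(x, m): probability of exactly x occurrences; zero for x < 0."""
    if x < 0:
        return 0.0
    return math.exp(-m) * m ** x / math.factorial(x)


def poisson_p(c, m):
    """P(c, m): probability of c or more occurrences."""
    return 1.0 - sum(poisson_f(x, m) for x in range(c))


def dp_dm(c, m, h=1e-6):
    """Central-difference estimate of dP(c, m)/dm."""
    return (poisson_p(c, m + h) - poisson_p(c, m - h)) / (2.0 * h)


m = 3.8
for c in range(0, 10):
    print(c, dp_dm(c, m), poisson_f(c - 1, m))
```

For $c = 0$ both columns are zero, in agreement with the convention $f(c-1) = 0$ used for samples truncated on the right only.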

4. CENSORED SAMPLES: NUMBER UNMEASURED OBSERVATIONS
IN EACH TAIL KNOWN

Doubly censored. Let $n_1$ and $n_2$ be the number of unmeasured
observations in the left and right tails respectively and let $n$ be the number
of measured observations for which $c \le x \le d$. The likelihood function
for a sample of this type drawn from the population (1) is

$$P(x_1, x_2, \cdots, x_n) = K[1 - P(c)]^{n_1}\Big[\prod_{i=1}^{n} f(x_i)\Big][P(d+1)]^{n_2}, \tag{12}$$

where $K$ is a constant, and other symbols are as previously defined.
Taking logarithms of (12), differentiating with the aid of (9) and equating
to zero, we obtain the estimating equation

$$\frac{1}{n}\frac{dL}{dm} = \frac{\bar x}{m} - 1 - \frac{n_1}{n}\,\frac{f(c-1)}{1 - P(c)} + \frac{n_2}{n}\,\frac{f(d)}{P(d+1)} = 0. \tag{13}$$

Singly censored on the left. In this case $n_2 = 0$, and the estimating
equation (13) becomes

$$\frac{1}{n}\frac{dL}{dm} = \frac{\bar x}{m} - 1 - \frac{n_1}{n}\,\frac{f(c-1)}{1 - P(c)} = 0. \tag{14}$$

When $c = 1$, a singly censored sample with the number of unmeasured
observations known is actually a complete rather than a restricted
sample since $n_1$ is simply the number of zeros in the sample and the
total sample size is $n + n_1$. In this case, equation (14) becomes

$$\frac{\bar x}{m} - 1 - \frac{n_1}{n}\,\frac{f(0)}{f(0)} = 0,$$

and

$$\hat m = \frac{\sum x_i}{n + n_1} = \frac{n\bar x}{n + n_1},$$

which agrees with equation (3) for a complete sample.

Singly censored on the right. In this instance, $n_1 = 0$, and the estimating
equation (13) becomes

$$\frac{1}{n}\frac{dL}{dm} = \frac{\bar x}{m} - 1 + \frac{n_2}{n}\,\frac{f(d)}{P(d+1)} = 0. \tag{15}$$

This is recognized as the appropriate estimating equation for the case
in which all sample observations for which $x > d$ have been pooled.
Tippett's estimator (loc. cit.) applies to this case.


5. CENSORED SAMPLES: TOTAL NUMBER UNMEASURED OBSERVATIONS
KNOWN BUT NOT THE NUMBER IN EACH TAIL SEPARATELY

Let $c$ and $d$ designate the terminals as in Sections 3 and 4. Let $n$ be
the number of measured observations for which $c \le x \le d$, and $n_0$ the
combined number of unmeasured observations in the two tails. The
likelihood function for a sample of this type from the population (1)
is then

$$P(x_1, x_2, \cdots, x_n) = K[1 - P(c) + P(d+1)]^{n_0}\prod_{i=1}^{n} f(x_i). \tag{16}$$

Taking logarithms, differentiating and equating to zero, we have the
estimating equation

$$\frac{1}{n}\frac{dL}{dm} = \frac{\bar x}{m} - 1 - \frac{n_0}{n}\,\frac{f(c-1) - f(d)}{1 - P(c) + P(d+1)} = 0. \tag{17}$$

We note that the singly censored cases in this instance are identical
with those of Section 4. When censored on the left, $n_0 = n_1$, $\lim_{d\to\infty} f(d) = 0$,
and $\lim_{d\to\infty} P(d+1) = 0$. Accordingly, (17) assumes the same form
as (14). When censored on the right only, $n_0 = n_2$, $c = 0$, $f(c-1) = 0$,
$P(c) = 1$, and (17) reduces to the same form as (15).


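6. VARIANCE OF ESTIMATES

Since maximum likelihood estimation has been employed, the
asymptotic variance of $\hat m$ can be expressed as

$$V(\hat m) \sim -\left[\frac{d^2L}{dm^2}\right]^{-1}_{m=\hat m}. \tag{18}$$

Second derivatives for the various cases considered are given below.

TRUNCATED SAMPLES: NUMBER MISSING
OBSERVATIONS UNKNOWN

Doubly Truncated

$$\frac{1}{n}\frac{d^2L}{dm^2} = -\frac{\bar x}{m^2} - \frac{f(c-2) - f(c-1) - f(d-1) + f(d)}{P(c) - P(d+1)} + \left[\frac{f(c-1) - f(d)}{P(c) - P(d+1)}\right]^2. \tag{19}$$

Singly Truncated on Left

$$\frac{1}{n}\frac{d^2L}{dm^2} = -\frac{\bar x}{m^2} - \left[\frac{f(c-2) - f(c-1)}{P(c)}\right] + \left[\frac{f(c-1)}{P(c)}\right]^2. \tag{20}$$

Singly Truncated on Right

$$\frac{1}{n}\frac{d^2L}{dm^2} = -\frac{\bar x}{m^2} + \left[\frac{f(d-1) - f(d)}{1 - P(d+1)}\right] + \left[\frac{f(d)}{1 - P(d+1)}\right]^2. \tag{21}$$

CENSORED SAMPLES: NUMBER OBSERVATIONS
IN EACH TAIL KNOWN

Doubly Censored

$$\frac{1}{n}\frac{d^2L}{dm^2} = -\frac{\bar x}{m^2} - \frac{n_1}{n}\left\{\frac{f(c-2) - f(c-1)}{1 - P(c)} + \left[\frac{f(c-1)}{1 - P(c)}\right]^2\right\} + \frac{n_2}{n}\left\{\frac{f(d-1) - f(d)}{P(d+1)} - \left[\frac{f(d)}{P(d+1)}\right]^2\right\}. \tag{22}$$

Singly Censored on Left

$$\frac{1}{n}\frac{d^2L}{dm^2} = -\frac{\bar x}{m^2} - \frac{n_1}{n}\left\{\frac{f(c-2) - f(c-1)}{1 - P(c)} + \left[\frac{f(c-1)}{1 - P(c)}\right]^2\right\}. \tag{23}$$

Singly Censored on Right

$$\frac{1}{n}\frac{d^2L}{dm^2} = -\frac{\bar x}{m^2} + \frac{n_2}{n}\left\{\frac{f(d-1) - f(d)}{P(d+1)} - \left[\frac{f(d)}{P(d+1)}\right]^2\right\}. \tag{24}$$

CENSORED SAMPLES: COMBINED NUMBER
OBSERVATIONS IN TAILS KNOWN

$$\frac{1}{n}\frac{d^2L}{dm^2} = -\frac{\bar x}{m^2} + \frac{n_0}{n}\left\{\frac{f(d-1) - f(d) - f(c-2) + f(c-1)}{1 - P(c) + P(d+1)} - \left[\frac{f(c-1) - f(d)}{1 - P(c) + P(d+1)}\right]^2\right\}. \tag{25}$$

Derivatives given in this section can be evaluated with the aid of
Molina's Tables as previously mentioned.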
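
As an illustration of how such a second derivative leads to a standard error, the sketch below evaluates the case of a sample singly censored on the right, obtained by differentiating the estimating equation $\bar x/m - 1 + (n_2/n)\,f(d)/P(d+1) = 0$, and cross-checks it against a numerical derivative. The function names are hypothetical, and the inputs ($n = 2565$, $n_2 = 43$, $d = 8$, $\sum x = 9683$, $m = 3.871$) are the values of the first numerical illustration in Section 7, used here only as an example.

```python
import math


def poisson_f(x, m):
    """f(x, m): probability of exactly x occurrences; zero for x < 0."""
    if x < 0:
        return 0.0
    return math.exp(-m) * m ** x / math.factorial(x)


def poisson_p(c, m):
    """P(c, m): probability of c or more occurrences."""
    return 1.0 - sum(poisson_f(x, m) for x in range(c))


def score_right_censored(m, xbar, n, n2, d):
    """(1/n) dL/dm for a sample singly censored on the right."""
    return xbar / m - 1.0 + (n2 / n) * poisson_f(d, m) / poisson_p(d + 1, m)


def curvature_right_censored(m, xbar, n, n2, d):
    """(1/n) d^2L/dm^2 for a sample singly censored on the right."""
    tail = poisson_p(d + 1, m)
    ratio = poisson_f(d, m) / tail
    slope_term = (poisson_f(d - 1, m) - poisson_f(d, m)) / tail
    return -xbar / m ** 2 + (n2 / n) * (slope_term - ratio ** 2)


def asymptotic_sd(m, xbar, n, n2, d):
    """Square root of -1 / (d^2L/dm^2) evaluated at m."""
    return math.sqrt(-1.0 / (n * curvature_right_censored(m, xbar, n, n2, d)))


n, n2, d = 2565, 43, 8
xbar = 9683 / n
sd = asymptotic_sd(3.871, xbar, n, n2, d)
print(round(sd, 3))
```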


7. PRACTICAL APPLICATIONS

For each case considered in this paper, the estimating equation can
be solved for $\hat m$ without too much difficulty by a simple inverse
interpolative process, provided a set of Poisson tables such as those of
Molina [3] are available. As a first approximation to $\hat m$ the sample mean,
$\bar x$, or some obvious modification thereof, will prove satisfactory in
many instances. Where applicable, estimates given by Moore (loc. cit.)
or by Rider (loc. cit.) may provide closer first approximations. The
illustrations which follow serve to clarify these points.

The data of Table 1, due to Rutherford and Geiger [6], will be used
to illustrate how estimates are calculated from samples. These data
concern the number of α particles observed in an eighth of a minute
time interval. Observations were recorded for 2608 such intervals, with
$x$ designating the number of particles observed during an interval, and
$f_0(x)$ designating the frequency or number of intervals during which
these observations were made. Observations of nine or more particles
per time interval were pooled. Various cases considered in this paper
are illustrated with these same data by appropriately changing basic
assumptions regarding the sample.

TABLE 1

Particles per Interval, x       0    1    2    3    4    5    6    7    8   9 and over   Total
Number of Intervals, f0(x)     57  203  383  525  532  408  273  139   45       43        2608
*— 
Illustration 1

Sample singly censored on the right. We first use the maximum
information provided by the sample which includes $n = 2565$ measured
observations and $n_2 = 43$ unmeasured or pooled observations in the
sample right tail. The terminus is at $d = 8$, and $\sum_{x=0}^{8} x f_0(x) = 9683$.
Accordingly, $\bar x = 9683/2565 = 3.7750$. The appropriate estimating equation
is (15) which we solve, using as a first approximation, $m = \bar x = 3.8$
(rounded off). From Molina's tables, we have $f(8, 3.8) = 0.024123$ and
$P(9, 3.8) = 0.015984$. On substituting these values in (15) we obtain

$$\left.\frac{1}{n}\frac{dL}{dm}\right|_{m=3.8} = \frac{3.7750}{3.8} - 1 + \frac{43}{2565}\left[\frac{0.024123}{0.015984}\right] = +0.01872.$$

Similarly, we compute

$$\left.\frac{1}{n}\frac{dL}{dm}\right|_{m=3.9} = \frac{3.7750}{3.9} - 1 + \frac{43}{2565}\left[\frac{0.026869}{0.018533}\right] = -0.00775.$$

To determine the required value, $\hat m$, for which $(1/n)\,dL/dm = 0$, we
interpolate linearly¹ as summarized below.

    m        (1/n) dL/dm
  3.800       +0.01872
  3.871        0.00000
  3.900       -0.00775

Thus our estimate is $\hat m = 3.871$, and using (18) and (24), we compute
$\sigma_{\hat m} = \sqrt{V(\hat m)} \approx 0.04$.
Illustration 2

Sample singly truncated on the right. Here, we neglect $n_2$ and assume
that the number of observations in the truncated tail is unknown.
Otherwise the sample remains the same as for illustration 1. Estimating
equation (11) is applicable in this instance, and on substituting the
necessary values, we have

$$\left.\frac{1}{n}\frac{dL}{dm}\right|_{m=3.8} = \frac{3.7750}{3.8} - 1 + \frac{0.024123}{1 - 0.015984} = +0.01794,$$

$$\left.\frac{1}{n}\frac{dL}{dm}\right|_{m=3.9} = \frac{3.7750}{3.9} - 1 + \frac{0.026869}{1 - 0.018533} = -0.00468.$$

¹ More precise interpolation formulas involving second and higher order differences might be required.


Е 


E 


А A zoo roses = — 0.00817. 
.. , . 2305 L0.018533 


ESTIMATION OF POISSON PARAMETER 167


Interpolating as in the previous illustration, we have m̂ = 3.879, and
from (18) and (21), σ_m̂ ≈ 0.04.


Illustration 3 


Doubly truncated sample. For this illustration, we arbitrarily
eliminate the first two classes of Table 1 in addition to the pooled
classes for measurements greater than 8, and assume the number of
missing observations to be unknown. In this instance the terminals are
c = 2 and d = 8. Completing the sample summary, we have
Σ_{x=2}^{8} x f_0(x) = 9480, n = 2305, and x̄ = 9480/2305 = 4.112798.
The appropriate estimating equation is (8), and from Molina's tables we
find f(1, 3.8) = 0.085009, f(8, 3.8) = 0.024123, P(2, 3.8) = 0.892620,
P(9, 3.8) = 0.015984, f(1, 3.9) = 0.078943, f(8, 3.9) = 0.026869,
P(2, 3.9) = 0.900815, and P(9, 3.9) = 0.018533. On substituting these
values in (8) we have

\[
\left.\frac{1}{n}\frac{dL}{dm}\right|_{m=3.8}
  = \frac{4.112798}{3.8} - 1 - \left[\frac{0.085009 - 0.024123}{0.892620 - 0.015984}\right]
  = +0.01286,
\]

\[
\left.\frac{1}{n}\frac{dL}{dm}\right|_{m=3.9}
  = \frac{4.112798}{3.9} - 1 - \left[\frac{0.078943 - 0.026869}{0.900815 - 0.018533}\right]
  = -0.00446.
\]

Interpolating, we have m̂ = 3.874, and from (18) and (19), σ_m̂ ≈ 0.05.

Illustration 4


Sample doubly censored with number unmeasured observations in each
tail known. Data for this illustration are the same as for illustration 3
with the added information that n_1 = 260 and n_2 = 43. The applicable
estimating equation is (13), and on making necessary substitutions, we
obtain

\[
\left.\frac{1}{n}\frac{dL}{dm}\right|_{m=3.8}
  = \frac{4.112798}{3.8} - 1
    - \frac{260}{2305}\left[\frac{0.085009}{1 - 0.892620}\right]
    + \frac{43}{2305}\left[\frac{0.024123}{0.015984}\right]
  = +0.02117,
\]

\[
\left.\frac{1}{n}\frac{dL}{dm}\right|_{m=3.9}
  = \frac{4.112798}{3.9} - 1
    - \frac{260}{2305}\left[\frac{0.078943}{1 - 0.900815}\right]
    + \frac{43}{2305}\left[\frac{0.026869}{0.018533}\right]
  = -0.00817.
\]

Interpolating, we find m̂ = 3.872, and from (18) and (22), σ_m̂ ≈ 0.04.


Illustration 5

Sample doubly censored with total number unmeasured observations
known, but not number in each tail separately. For this illustration we
assume the knowledge that n_0 = 303, but that n_1 and n_2 separately are
unknown. Otherwise, the data remain the same as for illustrations 3 and
4. Estimating equation (17) is applicable, and after making the
necessary substitutions, we have

\[
\left.\frac{1}{n}\frac{dL}{dm}\right|_{m=3.8}
  = \frac{4.112798}{3.8} - 1
    + \frac{303}{2305}\left[\frac{0.024123 - 0.085009}{1 - 0.892620 + 0.015984}\right]
  = +0.01744,
\]

\[
\left.\frac{1}{n}\frac{dL}{dm}\right|_{m=3.9}
  = \frac{4.112798}{3.9} - 1
    + \frac{303}{2305}\left[\frac{0.026869 - 0.078943}{1 - 0.900815 + 0.018533}\right]
  = -0.00359.
\]

Interpolation yields m̂ = 3.883, and from (18) and (25), σ_m̂ ≈ 0.05.

We dispense with illustrations of the remaining sample types since
solution of applicable estimating equations can be accomplished in the
same manner as in the five illustrations presented. To solve estimating
equations for any of the cases considered in this paper, standard
iterative procedures such as Newton's method could be employed, but for
simplicity and ease of application, the interpolative procedure
illustrated seems preferable.
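For readers who prefer an iterative solution, the five illustrations can be checked against a simple root finder. The sketch below uses bisection (chosen here for robustness; it is neither the interpolative procedure nor Newton's method) and computes the Poisson terms directly. The five score functions are transcribed from the numerical substitutions in the illustrations, which is an assumption about the form of equations (15), (11), (8), (13), and (17); all names are hypothetical.

```python
import math

def f(x, m):
    """Poisson probability of exactly x events."""
    return math.exp(-m) * m**x / math.factorial(x)

def P(x, m):
    """Probability of x or more events."""
    return 1.0 - sum(f(k, m) for k in range(x))

def bisect(g, lo, hi, tol=1e-9):
    """Zero of a function that is positive at lo and negative at hi."""
    if not (g(lo) > 0.0 > g(hi)):
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Sample summaries taken from Table 1 (terminals c = 2, d = 8).
c, d = 2, 8
n_a, xbar_a = 2565, 9683 / 2565      # illustrations 1 and 2
n_b, xbar_b = 2305, 9480 / 2305      # illustrations 3, 4 and 5
n1, n2 = 260, 43                     # unmeasured counts in the two tails

scores = {
    1: lambda m: xbar_a / m - 1 + (n2 / n_a) * f(d, m) / P(d + 1, m),
    2: lambda m: xbar_a / m - 1 + f(d, m) / (1 - P(d + 1, m)),
    3: lambda m: xbar_b / m - 1
                 - (f(c - 1, m) - f(d, m)) / (P(c, m) - P(d + 1, m)),
    4: lambda m: xbar_b / m - 1
                 - (n1 / n_b) * f(c - 1, m) / (1 - P(c, m))
                 + (n2 / n_b) * f(d, m) / P(d + 1, m),
    # n_0 = n1 + n2 is the only censoring information used here.
    5: lambda m: xbar_b / m - 1
                 + ((n1 + n2) / n_b) * (f(d, m) - f(c - 1, m))
                 / (1 - P(c, m) + P(d + 1, m)),
}

roots = {k: bisect(g, 3.0, 5.0) for k, g in scores.items()}
for k in sorted(roots):
    print(f"illustration {k}: m_hat = {roots[k]:.3f}")
```

The roots found this way should agree with the interpolated estimates to within a few units in the third decimal, which is about the accuracy linear interpolation over an interval of 0.1 can be expected to give.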

REFERENCES

[1] Bliss, C. I., "Estimation of the mean and its error from incomplete Poisson
    distributions," Connecticut Agricultural Experiment Station Bulletin 513
    (1948), 12 p.
[2] David, F. N., and Johnson, N. L., "The truncated Poisson," Biometrics, 8
    (1952), 275-85.
[3] Molina, E. C., Poisson's Exponential Binomial Limit, D. Van Nostrand Co.,
    Inc., (1942).
[4] Moore, P. G., "The estimation of the Poisson parameter from a truncated
    distribution," Biometrika, 39 (1952), 247-51.
[5] Rider, Paul R., "Truncated Poisson distributions," Journal of the American
    Statistical Association, 48 (1953), 826-30.
[6] Rutherford, E., and Geiger, Hans, "The probability variations in the
    distribution of α particles," Phil. Magazine, Series 6, 20 (1910), 698.
[7] Tippett, L. H. C., "A modified method of counting particles," Proceedings
    Royal Society, Series A, 137 (1932), 434-46.



TABLES OF THE EXPECTED VALUE OF 1/X FOR 
POSITIVE BERNOULLI AND POISSON VARIABLES 


EDWIN L. GRAB AND I. RICHARD SAVAGE
National Bureau of Standards
SUMMARY OF TABLES 


The random variable X is said to have a positive Bernoulli distribution
[11]* if the probability that X = x is equal to
$\binom{n}{x} p^x q^{n-x} (1 - q^n)^{-1}$ for x = 1, 2, ..., n, where
q = 1 - p and 0 < p ≤ 1. Similarly the variable X is said to have a
positive Poisson distribution if the probability that X = x is equal to
$e^{-m}(1 - e^{-m})^{-1} m^x / x!$ for x = 1, 2, ..., and m > 0.

Table I gives the values of E(1/X | n, p) to five decimal places where

\[
E(1/X \mid n, p) = (1 - q^n)^{-1} \sum_{x=1}^{n} \frac{1}{x} \binom{n}{x} p^x q^{n-x}
\tag{1}
\]

for the following values of the parameters:

    n = 2(1)20;  p = .01, .05(.05).95, .99
    n = 21(1)30; p = .01, .05(.05).50.

Table II gives values of E(1/X | m) to five decimal places, where

\[
E(1/X \mid m) = e^{-m}(1 - e^{-m})^{-1} \sum_{x=1}^{\infty} \frac{m^x}{x \cdot x!},
\tag{2}
\]

for the following values of the parameter:

    m = .01, .05(.05)1.0(.1)2.0(.2)5.0(.5)7.0(1)10(2)20.
PREPARATION OF TABLES


Table I was originally prepared directly from (1) using Tables of the
Binomial Probability Distribution [9]. Subsequently the table was
checked by using the recurrence relation

\[
E(1/X \mid n+1, p) = \frac{1}{n+1} + \frac{q(1 - q^{n})}{1 - q^{n+1}}\, E(1/X \mid n, p).
\tag{3}
\]

Table II was prepared directly from (2) making use of tables of the
Poisson distribution [4], [5], [6]. Some of the values were checked by
the use of

\[
E(1/X \mid m) = [\mathrm{Ei}(m) - \gamma - \log_e m]\, e^{-m} / (1 - e^{-m}),
\tag{4}
\]

making use of [7], [8], and [10].

* This article by Stephan contains much material of interest related to the contents of this paper.
In particular it gives a precise formulation of the mathematical situation in which these tables are
applicable in sampling problems. Also it presents many more results of interest.
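As a modern cross-check on the tabled quantities, the defining sums (1) and (2) and the recurrence can be evaluated directly. The Python sketch below is an illustration only; the function names are hypothetical, and the recurrence is taken in the form E(1/X | n+1, p) = 1/(n+1) + q(1 - q^n)/(1 - q^(n+1)) · E(1/X | n, p), which follows from (1) by Pascal's rule for binomial coefficients.

```python
import math

def e_inv_bernoulli(n, p):
    """E(1/X | n, p) for a positive Bernoulli variable, from the sum (1)."""
    q = 1.0 - p
    total = sum(math.comb(n, x) * p**x * q**(n - x) / x for x in range(1, n + 1))
    return total / (1.0 - q**n)

def e_inv_poisson(m):
    """E(1/X | m) for a positive Poisson variable, from the series (2)."""
    total, term, x = 0.0, 1.0, 0
    while True:
        x += 1
        term *= m / x                  # term is now m**x / x!
        total += term / x
        if x > m and term / x < 1e-17 * total:
            break                      # remaining terms are negligible
    return math.exp(-m) * total / (1.0 - math.exp(-m))

def next_n(e_n, n, p):
    """One step of the recurrence: E(1/X | n+1, p) from E(1/X | n, p)."""
    q = 1.0 - p
    return 1.0 / (n + 1) + q * (1.0 - q**n) / (1.0 - q**(n + 1)) * e_n

for m in (0.01, 1.0, 2.0):
    print(f"m = {m}: E(1/X|m) = {e_inv_poisson(m):.5f}")
print(f"n = 2, p = .5: direct {e_inv_bernoulli(2, 0.5):.5f}, "
      f"recurrence {next_n(1.0, 1, 0.5):.5f}")
```

Starting the recurrence from E(1/X | 1, p) = 1 reproduces the direct sum for every n, which is the kind of independent check described above.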


The entries in Tables I and II were obtained by two independent
computations to insure five-decimal accuracy.


USE OF TABLES 
The following situation often arises in sampling problems. 


An observation y_i (i = 1, ..., x) is made on each of x individuals and
the average,

\[
\bar{y} = \frac{1}{x} \sum_{i=1}^{x} y_i,
\]

is computed.

If the y_i's are independent observations on a random variable with
mean value μ and variance σ², we find that the mean value of ȳ is μ.
However, if x, the sample size, is a random variable, then the variance
of ȳ is not σ²/x but is σ²E(1/X). Thus one use of the tables is in finding
the variances of means when the sample sizes are random variables
with either positive Bernoulli or Poisson distributions. Extensive
discussions of the above situation can be found in [2], [3], [11], and [12].
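The variance formula can be seen in one line from the law of total variance. The derivation below is a sketch under an added assumption that is not spelled out above, namely that the sample size X is independent of the observations y_i:

\[
\operatorname{Var}(\bar{y})
  = E\bigl[\operatorname{Var}(\bar{y} \mid X)\bigr] + \operatorname{Var}\bigl(E[\bar{y} \mid X]\bigr)
  = E\!\left[\frac{\sigma^2}{X}\right] + \operatorname{Var}(\mu)
  = \sigma^2 E(1/X).
\]

The second term vanishes because the conditional mean of ȳ is μ whatever the sample size, so all of the variance comes from averaging σ²/X over the distribution of X.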

Some typical situations where X has a positive Bernoulli or Poisson
distribution are:

1) In estimating the average number of acres per farm planted with cotton,
   of those farms of a sample having any cotton planted.

2) In estimating the average weight of animals that will survive a certain
   experiment, where the probability of an animal dying is constant for each
   animal and independent of the other animals.

3) In estimating the average cost of fires in a certain city by examining the
   cost of all fires that occurred in a short time interval.

In using the tables one must have good estimates of p or m. However,
the above examples are typical in that they cover situations where
one is likely to have good estimates of the important quantities, i.e.,
proportion of farms growing cotton, the lethality of an experiment, and
the average number of fire alarms per day.

In some sampling problems the sample size follows other discrete
distributions, such as the hypergeometric. Tables of E(1/X) for this
distribution would be difficult to prepare since the distribution itself
is so poorly tabulated and since there are three parameters involved.
Hence, if in dealing with this distribution one feels that the binomial or
Poisson approximations are not adequate, then one could perform the
desired computations for the specific situations at hand. For some
discrete distributions the formula for E(1/X) is simple, as for example the
Geometric distribution.


APPROXIMATIONS 


Stephan [11] gives several methods for computing (1), but none of
these gives a simple approximation for the entire range of parameters
covered by Table I. His approximations are advantageous to use for
larger values of n than those covered by the present tables (see his
examples).

Finkner [3] suggests for large values of np that the following
relationship will hold:

(5)    1/np < E(1/X | n, p) < 1/(np - 1).

In preparing Table I it was noted for large values of np that a very good
approximation to E(1/X | n, p) is given by

(6)    1/(np - q).

We are told by one of the referees that (6) also appears in an
unpublished manuscript of W. A. Hendricks. The bounds in (5) and (6)
suffer from the disadvantage that there is no theory as to when they are
good approximations. On the other hand it is clear that sometimes they
are poor since 1/(np - 1) can take on negative values and 1/np,
1/(np - q), and 1/(np - 1) can take on values larger than one.
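The behavior of (5) and (6) is easy to inspect numerically. The sketch below compares the three quantities with the exact value from the defining sum; it is an illustration with hypothetical function names, and the parameter pairs are arbitrary choices (pairs with np = 1 are avoided because 1/(np - 1) is then undefined).

```python
import math

def e_inv_bernoulli(n, p):
    """Exact E(1/X | n, p) for a positive Bernoulli variable."""
    q = 1.0 - p
    total = sum(math.comb(n, x) * p**x * q**(n - x) / x for x in range(1, n + 1))
    return total / (1.0 - q**n)

def approximations(n, p):
    """The three quantities appearing in (5) and (6)."""
    q = 1.0 - p
    return {
        "1/np": 1.0 / (n * p),
        "1/(np-q)": 1.0 / (n * p - q),
        "1/(np-1)": 1.0 / (n * p - 1.0),
    }

for n, p in [(30, 0.5), (20, 0.5), (10, 0.5), (2, 0.1)]:
    exact = e_inv_bernoulli(n, p)
    approx = approximations(n, p)
    listing = ", ".join(f"{name} = {value:+.5f}" for name, value in approx.items())
    print(f"n = {n}, p = {p}: exact = {exact:.5f}; {listing}")
```

For n = 30, p = .5 the exact value falls between the two bounds in (5) and lies close to 1/(np - q); for n = 2, p = .1 the quantity 1/(np - 1) is negative and 1/np exceeds one, which is the failure noted in the text.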

As an immediate consequence of the Schwarz inequality we find for
any positive random variable that the following inequality is valid:

(7)    E(1/X) ≥ 1/E(X).

This inequality in the case of the positive Bernoulli and Poisson
distributions becomes,

(8)    E(1/X | n, p) ≥ (1 - q^n)/np

and

(9)    E(1/X | m) > (1 - e^{-m})/m,

respectively. In (8) the equality holds if either n = 1 or p = 1.

Using the inequality

(10)    1/x ≤ 1/(x + 1) + 3/[(x + 1)(x + 2)],

which is valid for x ≥ 1, we have for random variables which cannot take
on values less than one the following inequality:

(11)    E(1/X) ≤ E[1/(X + 1)] + 3E{1/[(X + 1)(X + 2)]}.

In the case where X has a positive Bernoulli distribution this gives

\[
E(1/X \mid n, p) \le [p(n+1)(1 - q^n)]^{-1}
  \left[ \bigl(1 - P(1 \mid n+1, p)\bigr) + \frac{3\bigl(1 - P(2 \mid n+2, p)\bigr)}{(n+2)p} \right]
\tag{12}
\]

where P(r | n, p) is the probability that a Bernoulli variable with
parameters n and p will be less than or equal to r. Finally when X has a
positive Poisson distribution we obtain

\[
E(1/X \mid m) \le [(1 - e^{-m})m]^{-1}
  \left[ \bigl(1 - P(1 \mid m)\bigr) + \frac{3\bigl(1 - P(2 \mid m)\bigr)}{m} \right],
\tag{13}
\]

where P(r | m) is the probability that a Poisson variable with parameter
m will be less than or equal to r.

This paper is primarily concerned with obtaining exact values of
E(1/X | n, p) and E(1/X | m). Techniques for finding the limiting
distributions and moments of reciprocals of positive Bernoulli and
Poisson random variables are available in sections 27.7 and 28.4 of [1].
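The lower bounds (8) and (9) and the upper bounds that follow from (11) can be checked against exact values. In the Python sketch below the upper bounds are written as [p(n+1)(1 - q^n)]^(-1) {[1 - P(1 | n+1, p)] + 3[1 - P(2 | n+2, p)]/[(n+2)p]} for the positive Bernoulli case and [(1 - e^(-m))m]^(-1) {[1 - P(1 | m)] + 3[1 - P(2 | m)]/m} for the positive Poisson case; these forms are obtained by evaluating the two expectations in (11) term by term and should be read as a reconstruction. Function names are hypothetical.

```python
import math

def binom_cdf(r, n, p):
    """P(r | n, p): probability that a Bernoulli (binomial) variable is <= r."""
    q = 1.0 - p
    return sum(math.comb(n, x) * p**x * q**(n - x) for x in range(r + 1))

def poisson_cdf(r, m):
    """P(r | m): probability that a Poisson variable is <= r."""
    return sum(math.exp(-m) * m**x / math.factorial(x) for x in range(r + 1))

def exact_bernoulli(n, p):
    q = 1.0 - p
    s = sum(math.comb(n, x) * p**x * q**(n - x) / x for x in range(1, n + 1))
    return s / (1.0 - q**n)

def exact_poisson(m, terms=400):
    total, term = 0.0, 1.0
    for x in range(1, terms + 1):
        term *= m / x                  # term is now m**x / x!
        total += term / x
    return math.exp(-m) * total / (1.0 - math.exp(-m))

def bounds_bernoulli(n, p):
    """Lower bound (8) and the upper bound derived from (11)."""
    q = 1.0 - p
    lower = (1.0 - q**n) / (n * p)
    inner = (1.0 - binom_cdf(1, n + 1, p)) \
        + 3.0 * (1.0 - binom_cdf(2, n + 2, p)) / ((n + 2) * p)
    return lower, inner / (p * (n + 1) * (1.0 - q**n))

def bounds_poisson(m):
    """Lower bound (9) and the upper bound derived from (11)."""
    lower = (1.0 - math.exp(-m)) / m
    inner = (1.0 - poisson_cdf(1, m)) + 3.0 * (1.0 - poisson_cdf(2, m)) / m
    return lower, inner / ((1.0 - math.exp(-m)) * m)

for m in (0.5, 1.0, 5.0):
    lo, hi = bounds_poisson(m)
    print(f"m = {m}: {lo:.5f} <= {exact_poisson(m):.5f} <= {hi:.5f}")
for n, p in [(2, 0.5), (10, 0.3), (30, 0.5)]:
    lo, hi = bounds_bernoulli(n, p)
    print(f"n = {n}, p = {p}: {lo:.5f} <= {exact_bernoulli(n, p):.5f} <= {hi:.5f}")
```

Both bounds become equalities when n = 1, since X is then identically one and (10) holds with equality at x = 1.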




INTERPOLATION AND EXTRAPOLATION 


One entry in each column of Table I bears an asterisk. For values of p
equal to or larger than the one corresponding to the entry with an
asterisk, the approximation 1/(np - q) is accurate to at least two
decimal places and usually to two significant figures; and, in general, if
np ≥ 10, it has been found that 1/(np - q) gives at least two-place
accuracy. These statements are empirical. Although it has not been
possible to derive these facts mathematically they have been observed in
many computations.

If n ≤ 10 and p is above the asterisk, linear interpolation gives
results accurate to two decimal places.

For the cases not covered in the two preceding paragraphs more 
complicated interpolation formulations might be advantageous to use. 
In particular when np or m are greater than 5 it has been noted that 
interpolation of the functions pE(1/X|n, p) and mE(1/X | m) give bet- 
ter results than linear interpolation. 

For values of n greater than 30, it has been found that if np ≤ 2 one
may set np = m and use Table II. This method seems always to yield
at least two significant figures.

Formula (3) is easy to apply and may be found useful for extending
Table I to other values of n as needed.
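Both suggestions can be tried out together. The sketch below extends the positive Bernoulli values beyond n = 30 by stepping the recurrence E(1/X | n+1, p) = 1/(n+1) + q(1 - q^n)/(1 - q^(n+1)) · E(1/X | n, p) from E(1/X | 1, p) = 1, and compares the result with the positive Poisson value at m = np. The case n = 40, p = .05 is an arbitrary example, and the function names are hypothetical.

```python
import math

def extend_by_recurrence(n_max, p):
    """E(1/X | n, p) for n = 1..n_max, stepping the recurrence from n = 1."""
    q = 1.0 - p
    values = {1: 1.0}
    for n in range(1, n_max):
        ratio = q * (1.0 - q**n) / (1.0 - q**(n + 1))
        values[n + 1] = 1.0 / (n + 1) + ratio * values[n]
    return values

def e_inv_poisson(m, terms=400):
    """E(1/X | m) for a positive Poisson variable, from its defining series."""
    total, term = 0.0, 1.0
    for x in range(1, terms + 1):
        term *= m / x                  # term is now m**x / x!
        total += term / x
    return math.exp(-m) * total / (1.0 - math.exp(-m))

n, p = 40, 0.05
exact = extend_by_recurrence(n, p)[n]
approx = e_inv_poisson(n * p)          # set m = np and use the Poisson value
print(f"n = {n}, p = {p}: recurrence {exact:.5f}, Poisson with m = np {approx:.5f}")
```

In this example the two values agree to well within the two significant figures claimed for the method.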




ACKNOWLEDGMENT 


Thanks are due Lola S. Deming for her careful checking of the
tables, and for correction of a number of errors found in the original
values.


REFERENCES

[1] Cramer, H., Mathematical Methods of Statistics. Princeton, N. J.: Princeton
    University Press, 1951.
[2] Deming, W. E., Some Theory of Sampling. New York: John Wiley and
    Sons, Inc., 1950, 449-54.
[3] Finkner, A. L., "Further investigation on the theory and application of
    sampling for scarcity items," Institute of Statistics, University of North
    Carolina, Mimeo. Series 30.
[4] Fry, T. C., Probability and Its Engineering Uses, D. Van Nostrand
    Company, Inc., 458-62 (1928).
[5] Kitagawa, T., Tables of Poisson Distribution, Baifukan, Tokyo, Japan,
    (1952).
[6] Molina, E. C., Poisson's Exponential Binomial Limit, D. Van Nostrand
    Company, Inc., (1947).
[7] National Bureau of Standards, Tables of Sine, Cosine, and Exponential
    Integrals, 2, MT6, U. S. Government Printing Office, Washington, D. C.
[8] National Bureau of Standards, Tables of Natural Logarithms, MT7, MT9,
    MT10, MT12, U. S. Government Printing Office, Washington, D. C.
[9] National Bureau of Standards, Tables of the Binomial Probability
    Distribution, Applied Mathematics Series 6.
[10] National Bureau of Standards, Tables of the Exponential Function e^x,
    Applied Mathematics Series 14.
[11] Stephan, F. F., "The expected value and variance of the reciprocal and
    other negative powers of a positive Bernoullian variate," Annals of
    Mathematical Statistics, 16 (1945), 50-61.
[12] Horvitz, D. G., Ratio method of estimation in sample surveys, Unpublished
    thesis, Iowa State College, 1953.


TABLE I

E(1/X | n, p) = (1 - q^n)^{-1} Σ_{x=1}^{n} (1/x) C(n, x) p^x q^{n-x}     (q = 1 - p)

[Table body, with columns n = 2(1)30 and rows p = .01, .05(.05).95, .99 (rows to p = .50 only for n = 21(1)30), printed sideways in the original; the entries are not legible in this copy.]

* For explanation, see section headed Interpolation and Extrapolation.

TABLE II

E(1/X | m) = e^{-m}(1 - e^{-m})^{-1} Σ_{x=1}^{∞} m^x/(x · x!)

     m     E(1/X|m)        m     E(1/X|m)

    .01     .99750        1.9     .59351
    .05     .98754        2.0     .57659
    .10     .97514        2.2     .54417
    .15     .96282        2.4     .51361
    .20     .95058        2.6     .48488
    .25     .93842        2.8     .45792
    .30     .92636        3.0     .43268
    .35     .91435        3.2     .40909
    .40     .90244        3.4     .38707
    .45     .89062        3.6     .36654
    .50     .87889        3.8     .34742
    .55     .86725        4.0     .32963
    .60     .85571        4.2     .31308
    .65     .84426        4.4     .29770
    .70     .83292        4.6     .28340
    .75     .82166        4.8     .27012
    .80     .81052        5.0     .25777
    .85     .79948        5.5     .23055
    .90     .78854        6.0     .20779
    .95     .77771        6.5     .18866
   1.00     .76699        7       .17249
   1.10     .74587        8       .14689
   1.20     .72520        9       .12776
   1.30     .70499       10       .11302
   1.40     .68523       12       .09190
   1.50     .66594       14       .07749
   1.60     .64712       16       .06702
   1.70     .62878       18       .05906
   1.80     .61019       20       .05280

STATISTICAL ABSTRACTS

This section, an experiment, will present abstracts of articles on statistical
methods. Such articles appear in numerous journals and it is the aim of this
section to call attention to them by brief summaries. The object of each
abstract is not a critical review of an article but a statement in as
nonmathematical terms as possible of the statistical problem considered, the
character of the methods used to solve it, and the results obtained.

Certain journals which normally contain articles of statistical interest will
be abstracted regularly. These are American Journal of Public Health,
Annals of Eugenics, Annals of Mathematical Statistics, Biometrics, Biometrika,
Calcutta Statistical Bulletin, Econometrica, Human Biology, Journal of
Agricultural Sciences, Journal of the Royal Statistical Society, Series B,
Psychometrika, Sankhya, and Sociometry. Most articles on statistical methods
appearing in these journals in 1953 or later will be abstracted. Papers on
statistical methods published elsewhere will be included as they come to the
attention of the Abstracts Editor; readers are invited to submit to him (at
the address below) abstracts or suggestions of papers for abstracting.

The usefulness of the section will depend on the thoroughness of coverage,
the quality, and the style of the abstracts. Criticisms and suggestions are
invited. The Department of Statistics of the University of North Carolina
has accepted the responsibility for the section and will depend not only on
the faculty and graduate students of the Institute of Statistics but also on
correspondents from other universities, government, and industry.

All communications concerning this section should be addressed to the
Abstracts Editor, Professor George E. Nicholson, Jr., Chairman of the
Department of Statistics, University of North Carolina, Chapel Hill, North
Carolina.



Abelson, R. P., "A note on the Neyman-Johnson technique,"
Psychometrika, 18 (1953), 213-18.

In a multivariate prediction problem, for what values of the predictor
variables will two groups differ significantly on the criterion variable?
The region of significance of the Neyman-Johnson technique is defined
as the set of points of the predictor space where one group is
significantly better than the other on the criterion variable. These
latter authors have provided an analytic definition of this region for
the case of three predictor variables. The present author generalizes
the solution to any number of predictors. A ratio approximating the
generalized region of significance is proposed and this ratio is shown
to be asymptotically equivalent to the expression obtained by Neyman
and Johnson. The derivation given by the author is a straightforward
application of the critical ratio principle to the difference between
predicted criterion scores for the two groups. The order of the
approximation of the ratio given here to the Neyman-Johnson ratio is
that of the order with which the Beta distribution is approximated by
the F distribution. B. J. Wiser, University of North Carolina.


Bartlett, M. S., "The statistical significance of odd bits of
information," Biometrika, 39 (1952), 228-37.

The author presents a method of pooling information from n independent
events to test a given hypothesis, H. If each event is a dichotomy, the
total information is i_n = -Σ log p_i, where p_i is the probability of
the event occurring, given H. The expected value and variance of i_n
are E(i_n) = I_n = -Σ_i (p_i log p_i + q_i log q_i);
σ²(i_n) = Σ_i p_i q_i [log (p_i/q_i)]². All logarithms are to the
base e. H is tested by considering the ratio (i_n - I_n)/σ(i_n) as a
normal deviate. The results are extended to a multichotomy. An example
is presented. Some observations are made on the use of this method for
certain types of dependent observations. A brief discussion is given on
the construction of confidence limits when the estimator is the ratio
of random variates. R. L. Anderson, North Carolina State College.
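The test statistic described in this abstract is short enough to sketch in code. The Python fragment below is an illustration, not taken from the paper abstracted: it assumes that each term of i_n is minus the logarithm of the probability, under H, of whichever outcome was actually observed, and all names are hypothetical. The helper moments_under_h enumerates every outcome pattern to confirm that the resulting deviate has mean 0 and variance 1 under H.

```python
import itertools
import math

def information_deviate(p, occurred):
    """(i_n - I_n) / sigma for n independent dichotomies.

    p[i] is the probability under H that event i occurs; occurred[i] records
    whether it did. Every p[i] must lie strictly between 0 and 1, and at
    least one must differ from 1/2 so that the variance is positive.
    """
    i_n = -sum(math.log(pi if occ else 1.0 - pi) for pi, occ in zip(p, occurred))
    expected = -sum(pi * math.log(pi) + (1.0 - pi) * math.log(1.0 - pi) for pi in p)
    variance = sum(pi * (1.0 - pi) * math.log(pi / (1.0 - pi)) ** 2 for pi in p)
    return (i_n - expected) / math.sqrt(variance)

def moments_under_h(p):
    """Mean and variance of the deviate under H, by enumerating all outcomes."""
    mean = second = 0.0
    for outcome in itertools.product([True, False], repeat=len(p)):
        weight = math.prod(pi if occ else 1.0 - pi for pi, occ in zip(p, outcome))
        z = information_deviate(p, outcome)
        mean += weight * z
        second += weight * z * z
    return mean, second - mean * mean

p = [0.2, 0.7, 0.9]
print(information_deviate(p, [True, False, True]))
print(moments_under_h(p))
```

With only a handful of events the normal approximation is of course rough; the enumeration checks the first two moments only.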


Bliss, C. I., "Fitting the negative binomial distribution to biological
data"; Fisher, R. A., "Note on the efficient fitting of the negative
binomial," Biometrics, 9 (1953), 176-200.

See Fisher, R. A.


Cox, D. R., "Estimation by double sampling," Biometrika, 39 (1952),
217-27.

Double sampling methods are developed to obtain a large sample
estimator, t, of some parameter, θ, both for known and unknown
population variance. The variance of t is assumed to be some function
of θ, a(θ), given in advance. The results are applied to the estimation
of normal and binomial means, when a(θ) = a or aθ. Estimation by
confidence intervals and a combined estimation and testing procedure
are also considered. R. L. Anderson, North Carolina State College.

Dixon, W. J., "Processing data for outliers," Biometrics, 9 (1953),
74-89.

The problem considered is how the mean, μ, and standard deviation, σ,
should be estimated where N observations from a normal population
N(μ, σ²) may actually contain some small unknown proportion γ of
observations from a "contaminating" population with a different
standard deviation, N(μ, λ²σ²), or from one with a different mean,
N(μ + λσ, σ²). The results in the paper are stated for sample sizes
N = 5 and N = 15. The numerical results are based in large part on
experimental sampling. The behavior of mean and median (as estimates
of μ) and of s² (estimate of σ²) and of s and the range (as estimators
of σ) is investigated for various values of λ and γ. The effect on the
estimators of processing the data for outlying observations using
various levels of significance is explored. For the whole range of λ
and γ, and for sample sizes N = 5 and N = 15, recommendations are made
as to the level of significance to use in processing data for outliers,
and which estimates to employ. Lincoln Moses, Stanford University.



Draper, J., "Properties of distributions resulting from certain simple
transformations of the normal distribution," Biometrika, 39 (1952),
290-301.

Given a non-normal variate, x, to be transformed to a normal z with
unit variance. The following transformations are considered:
S_L: z = γ + δ log (x - ξ); S_U: z = γ + δ sinh⁻¹ y;
S_B: z = γ + δ log [y/(1 - y)], where y = (x - ξ)/λ. Methods of
estimating the parameters, γ, δ, ξ, and λ are given for S_U. Several
examples are given, including its use for normalizing t and non-central
t. A quadrature formula is suggested for estimating the moments of the
S_B system. The author states that the calculation of the parameters of
the S_L system "presents no difficulty using the first three moments of
the distribution of x." R. L. Anderson, North Carolina State College.


Fisher, R. A., "Note on the efficient fitting of the negative
binomial"; Bliss, C. I., "Fitting the negative binomial distribution to
biological data," Biometrics, 9 (1953), 176-200.

The negative binomial distribution is a two parameter distribution for
a discrete random variable which gives the probability of x occurrences
of an event in a sampling unit as
[(k + x - 1)!]/[x!(k - 1)!] · [p^x/(1 + p)^{k+x}]. For limiting values
of the parameters this distribution yields the Poisson or the Fisher
logarithmic series as special cases. The distribution is of wide
application in biological sciences having been used to describe insect
populations, distribution of bacterial clumps, accident rates, etc.
This paper discusses various models leading to the negative binomial
probability distribution, and to various alternative distributions more
or less related to it. The mean of the distribution (which is pk) is
efficiently estimated by the sample mean x̄. The parameter k has been
estimated by the method of moments, or from the number of zero
occurrences. Fisher here gives the maximum likelihood estimates of the
parameters together with a convenient arithmetical scheme of
computation. Tests of fit to the model are discussed. Procedures are
illustrated with numerical examples. Lincoln Moses, Stanford
University.
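The probability formula quoted in this abstract, together with the simplest of the estimation methods it mentions, can be sketched as follows. This is an illustration only: the factorials are replaced by gamma functions so that k need not be an integer, the moment estimates rest on the standard result that the variance in this parameterization is pk(1 + p), and the function names are hypothetical. Fisher's maximum likelihood scheme is not reproduced here.

```python
import math

def neg_binomial_pmf(x, k, p):
    """Probability of x occurrences: Gamma(k+x) / (x! Gamma(k)) * p^x / (1+p)^(k+x)."""
    log_coef = math.lgamma(k + x) - math.lgamma(x + 1) - math.lgamma(k)
    return math.exp(log_coef + x * math.log(p) - (k + x) * math.log1p(p))

def moment_estimates(counts):
    """Method-of-moments estimates (p, k) from mean = pk and variance = pk(1 + p).

    Requires the sample variance to exceed the sample mean.
    """
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    if variance <= mean:
        raise ValueError("sample variance must exceed the mean")
    p_hat = variance / mean - 1.0
    return p_hat, mean / p_hat

k, p = 2.5, 1.5
probs = [neg_binomial_pmf(x, k, p) for x in range(400)]
total = sum(probs)
mean = sum(x * pr for x, pr in enumerate(probs))
print(f"total probability = {total:.6f}, mean = {mean:.6f}, pk = {p * k}")
print(moment_estimates([0, 1, 1, 2, 3, 5, 8]))
```

The printed mean should agree with pk, in line with the statement above that the mean of the distribution is pk.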


Grundy, P. M., "The fitting of grouped truncated and grouped censored
normal distributions," Biometrika, 39 (1952), 252-59.

A distribution is said to be censored if the frequency of observations
in the truncated region is known although their values are unknown. A
process involving adjusted sample moments, which is used in connection
with published tables, is shown to be equivalent to maximum likelihood
estimation. In the special case when the group intervals are equal,
approximate formulas for the adjusted moments become particularly
simple. The accuracy of the approximations is indicated. Information
and covariance matrices are studied from the standpoint of effects of
grouping. A numerical example illustrates the principles. T. W. Horner,
North Carolina State College.


Gupta, A. K., "Estimation of the mean and standard deviation of a
normal population from a censored sample," Biometrika, 39 (1952),
260-73.

A sample may be censored in two ways: I. observations below or above a
given point may be censored; II. the (n - k) smallest or greatest
observations out of a sample of size n may be censored. The author was
concerned with estimating the mean and standard deviation of a normal
population from a type II censored sample. Tables were given which
facilitate the computation of the maximum likelihood estimates and
their asymptotic variances and covariances. Since the maximum
likelihood estimates may be biased for small n, the best linear
unbiased estimate was derived. Coefficients for finding the best linear
estimate of the mean and the standard deviation from censored data for
n ≤ 10 are given. An alternative unbiased linear estimate, which has
great efficiency, was proposed for n slightly larger than 10. Examples
illustrate the three methods of estimation. T. W. Horner, North
Carolina State College.


Hyrenius, H., "Sampling from bivariate non-normal universes by means of
compound normal distributions," Biometrika, 39 (1952), 238-46.

The effect of non-normality on estimates of the correlation and
regression coefficients and their variances is studied by considering
each sample to be from a different bivariate normal universe. Only
inequality in the means for the different universes is considered in
this paper. R. L. Anderson, North Carolina State College.


Jacob, Walter C., "Split-plot half-plaid squares for irrigation
experiments," Biometrics, 9 (1953), 157-75.

The objectives of the experiment were to study the effects of nitrogen,
phosphate, and potash (each at 3 levels) on the yield (in pounds) of
U.S. No. 1 tubers for three varieties of potatoes in the presence or
absence of irrigation; altogether about two and a half acres were
available in two blocks of equal size. All combinations of varieties,
fertilizers, and irrigation imply 162 different treatments. Since the
main effect of irrigation was of little interest (but its interactions
were of interest) the two blocks were each divided into two plots, one
irrigated, and one not. The 81 treatment combinations to be applied to
the four split plots were arranged as a 9 × 9 quasi-latin square by
confounding portions of second and third order interactions with the
rows and columns of the field. This removal of row and column sum of
squares resulted in about 100% gain in precision. Numerical analysis of
the data, and the interpretation are given. Lincoln Moses, Stanford
University.

Johnson, N. L., "Approximations to the probability integral of the
distribution of range," Biometrika, 39 (1952), 417-19.

Given a random sample of n from F(x). Approximate formulas for the
probability of the range not exceeding w are given. Comparisons are
made with exact probabilities and significance levels for F(x) normal.
R. L. Anderson, North Carolina State College.


Kaplan, E. L., "Tensor notation and the sampling cumulants of
k-statistics," Biometrika, 39 (1952), 319-23.

Concise formulas are given for sampling from infinite populations. It
is shown that results for multivariate distributions are mild
generalizations of those for univariate relations. R. L. Anderson,
North Carolina State College.


' Kimball, A. W., “The fitting of multi-hit 


survival curves,” Biometrics, 9,1953), 201- 
11. Ў 

Let a population of organisms be ex-
posed to a dose of radiation, x. Suppose that
an organism loses its viability if and only if
all of n “sensitive units” in the organism are
inactivated, or hit. Further, assume that
the probability of any one unit being hit
is $1-e^{-kx}$, and that hits on various units are
independent. Then the probability of an
organism losing its viability is $(1-e^{-kx})^n$.
If we write $u_i$ for the logarithm of the pro-
portion of organisms not surviving at dose
$x_i$, then $E(u_i) = n \log(1-e^{-kx_i})$. The parame-
ters to be estimated are k and n. An itera-
tive method of solving the (non-linear) least
squares equations is given. The asymptotic




variance-covariance matrix (under normal-
ity assumptions) is given to permit ap-
proximate interval estimation. Lincoln
Moses, Stanford University.
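
The multi-hit model in this abstract can be sketched numerically. The following is only an illustration, assuming the hit probability for one unit is 1 - exp(-kx); the function names are hypothetical, and the unweighted grid search over k is a simple stand-in, not the iterative method of the paper.

```python
import math

def prob_nonviable(x, k, n):
    """Probability that all n sensitive units are hit at dose x."""
    return (1.0 - math.exp(-k * x)) ** n

def expected_u(x, k, n):
    """E(u) = n * log(1 - exp(-k * x)), the log proportion not surviving."""
    return n * math.log1p(-math.exp(-k * x))

def fit_multi_hit(doses, u_values, k_grid):
    """Scan k over a grid; for fixed k the least squares n has a closed form."""
    best = None
    for k in k_grid:
        g = [math.log1p(-math.exp(-k * x)) for x in doses]
        n_hat = sum(gi * ui for gi, ui in zip(g, u_values)) / sum(gi * gi for gi in g)
        sse = sum((ui - n_hat * gi) ** 2 for gi, ui in zip(g, u_values))
        if best is None or sse < best[0]:
            best = (sse, k, n_hat)
    return best[1], best[2]

# Noise-free illustration with made-up values k = 0.5, n = 3.
doses = [1.0, 2.0, 3.0, 4.0, 6.0]
u_values = [expected_u(x, 0.5, 3) for x in doses]
k_hat, n_hat = fit_multi_hit(doses, u_values, [0.1 * i for i in range(1, 21)])
```

Because the model is linear in n once k is fixed, only a one-dimensional search is needed in this sketch; real data would also call for weights reflecting the unequal variances of the observed log proportions.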



Kruskal, William H., “On the uniqueness 
of the line of organic correlation,” Bio- 
metrics, 9 (1953), 47-58, 

For some purposes it may be convenient
to represent a multivariate distribution by a
single straight line. This paper considers the
line passing through the mean of the distri-
bution and having direction numbers pro-
portional in absolute value to the standard
deviations, and with their signs determined
by the signs of the covariances. This is
called the line of organic correlation. It is
shown that this is the unique line based on
first and second moments which transforms
reasonably under omission of coordinates or
under change of origin or scale, and which
also provides the proper directions of asso-
ciation. If the multivariate distribution is
normal then the line is shown to maximize
the probability of correct prediction in a
certain sense. Certain geometrical proper-
ties of the line are proved (no assumption of
normality is made). Problems of sampling
are not considered. Lincoln Moses, Stan-
ford University.


Kupperman, M., “On exact grouping cor- 
rections to moments and cumulants,” Bi- 
ometrika, 39 (1952), 429-34. 

Corrections for the cumulants are given 
for the rectangular and triangular distri-
butions; corrections for the mean and vari-
ance are given for the semi-triangular
(right-half of the triangular), parabolic,
and exponential distribution. R. L. Ander-
son, North Carolina State College.


Lancaster, H. O., “Statistical control of
counting experiments,” Biometrika, 39
(1952), 419-29.

Various random experiments were per-
formed to study the adequacy of the χ² test
[illegible] of counts from a Poisson dis-
tribution with small mean and few counts
([illegible] 5). The author concludes that, “χ² is
[illegible] to remain the method of choice in
statistical control of counting, regardless of
size of the sample.” R. L. Anderson,
North Carolina State College.

Leslie, P. H., “The estimation of population
parameters from data obtained by means
of the capture-recapture method. II. The
estimation of total numbers,” Biometrika,
39 (1952), 363-89.




Methods are given for estimating the 
total numbers in a population under as- 
sumptions of constant and varying death- 
rate and dilution of the population. The 
death rate is allowed to vary both in time 
and between different groups of animals.
Preliminary analysis of a set of data is given 
which provides for a test of the absence of 
dilution and for a method of obtaining ap- 
proximate estimates from a long chain of 
samples. L. D. Calvin, North Carolina
State College. 


Lord, Frederic M., “An application of con- 
fidence intervals and of maximum likelihood 
to the estimation of an examinee's ability,” 
Psychometrika, 18 (1953), 56-57. 

Given the performance of an individual 
on a series of fallible items which sample a 
specified ability, what is the best estimate 
of that individual's “true” ability? The 
author seeks to construct a metric for meas- 
uring the ability underlying a test score 
that will remain invariant under presum-
ably comparable measures of a given abil-
ity. The basic parameters in the estimation
model are: $h_i$, a measure of item difficulty re-
lated to the proportion $p_i$ of examinees who
answer the item correctly; c, a measure of true
ability; $R_i$, the biserial correlation between
answer to item i and true ability of exam-
inees. From these basic parameters, assum-
ing that the distribution functions needed
are normal, the author derives expressions
for the probability that individual a with
ability level $c_a$ will answer item i correctly.
Given these theoretical probabilities, the
author obtains maximum likelihood es-
timates for $c_a$ and relates these estimates to
the usual type of test score. It is shown that
in the special case where all items in a test
are of equal difficulty and are equally cor-
related with the ability measured, the
maximum likelihood estimate is a simple
function of the usual type of test score. For-
mulas for the standard error of the maxi-
mum likelihood estimates are obtained for
conditions of the model. Relationships be-
tween these standard errors and the dis-
criminating power of the test at various
ability levels are determined and pro-
cedures for estimating confidence intervals
for the true ability score in terms of the
test score are given. For tests composed of
equivalent items, the shortest confidence
interval for the true score as a function of
the test score is obtained for test scores
slightly above the halfway point between a
chance score and a perfect score. B. J.
Winer, University of North Carolina.
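
A small numerical sketch may help fix the idea of a maximum likelihood estimate of ability. The abstract does not print the probability expression, so the normal-ogive form used below, Φ((R·c − h)/√(1 − R²)), is an assumption on our part, as are the function names and the crude grid search.

```python
import math

def phi(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_correct(c, h, r):
    """Assumed normal-ogive probability of a correct answer (not from the abstract)."""
    return phi((r * c - h) / math.sqrt(1.0 - r * r))

def log_likelihood(c, answers, items):
    """Log likelihood of a 0/1 answer pattern given ability c and items (h, r)."""
    total = 0.0
    for a, (h, r) in zip(answers, items):
        p = p_correct(c, h, r)
        total += math.log(p) if a else math.log(1.0 - p)
    return total

def mle_ability(answers, items, lo=-4.0, hi=4.0, steps=8000):
    """Grid-search maximum likelihood estimate of ability on [lo, hi]."""
    grid = (lo + (hi - lo) * i / steps for i in range(steps + 1))
    return max(grid, key=lambda c: log_likelihood(c, answers, items))

# Ten equivalent items (h = 0, R = 0.6); scores of 5 and 7 correct out of 10.
items = [(0.0, 0.6)] * 10
c_five = mle_ability([1] * 5 + [0] * 5, items)
c_seven = mle_ability([1] * 7 + [0] * 3, items)
```

With equivalent items the estimate depends on the answers only through the number correct, which is the special case mentioned in the abstract.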




Maritz, J. S., “Estimation of the correlation 
coefficient in the case of a bivariate normal 
population when one of the variables is di-
chotomized,” Psychometrika, 18 (1953), 97-
110. 

Given a normal bivariate population in 
which one of the variates has been di- 
chotomized and the other variate is con- 
tinuous but restricted in some way. The 
biserial correlation coefficient is no longer a 
consistent estimate of the population cor-
relation ρ. An estimate G (defined as
$b/(1+b^2)^{1/2}$, where b is the estimated re-
gression coefficient of the continuous variate
on the dichotomized variate) has been pro-
posed to handle this latter case. The author
has adapted the methods of probit analysis
for estimating b for various cases of restric- 
tion in the continuous variate. The deriva- 
tions of these methods are presented. Em- 
pirical sampling experiments from normal 
bivariate populations were carried out to 
obtain information on the sampling dis- 
tribution of the coefficient G. Comparisons 
are made between variances of the probit 
estimates of the regression coefficients and 
those obtained from other estimates. The 
empirical results indicate that G is a more
efficient estimate of ρ than is the biserial
correlation, even in those cases where both
coefficients are consistent estimates of ρ.
B. J. Winer, University of North Caro-
lina.

McIntyre, G. A., “A method for unbiased
selective sampling using ranked sets,”
Australian Journal of Agricultural Re-
search, 3 (1952), 385-90.

A novel method of sampling to estimate
the mean value of a characteristic is pre-
sented for the case where measurements of
the characteristic are expensive but it is
easy to rank a sample with respect to it.
For example, it may be easy to rank the
plants on a certain area of ground with
respect to height, weight, or crop yield, but
considerably more expensive actually to
measure any of these characteristics. The
procedure is to form n independent random
samples of size n each (i.e., draw a random
sample of n² and divide it at random into n
subsamples of n each), then get a final
sample of n by selecting the largest item
from the first subsample, the second largest
from the second subsample, and so on down
to the smallest from the nth subsample. The
precision of an arithmetic mean calculated
from this final sample of n is considerably
greater than for a simple random sample of
n. The ratio of the variances for various


AMERICAN STATISTICAL ASSOCIATION JOURNAL, MARCH 1954 


population forms and sample sizes ranges
from 1.33 (negative exponential distribu-
tion, n = 2) to 3.00 (rectangular distribution,
n = 5). The author suggests (n+1)/2 as the
typical ratio. The paper also discusses esti-
mation of second and higher population mo-
ments, use of a priori knowledge about dis-
tribution form, errors of ranking, and clus-
tering of sets to simplify ranking. W. A.
Wallis, University of Chicago.
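
The two extreme ratios quoted in this abstract can be checked with exact arithmetic. The sketch below assumes error-free ranking and uses the standard variances of order statistics for the rectangular and unit exponential distributions; the function names are ours, not the author's.

```python
from fractions import Fraction

def var_order_stat_uniform(i, n):
    """Variance of the i-th smallest of n draws from the rectangular (0, 1) law."""
    return Fraction(i * (n - i + 1), (n + 1) ** 2 * (n + 2))

def var_order_stat_exponential(i, n):
    """Variance of the i-th smallest of n draws from the unit exponential law."""
    return sum(Fraction(1, j * j) for j in range(n - i + 1, n + 1))

def precision_ratio(n, var_order_stat, population_variance):
    """Variance of a simple random sample mean over that of a ranked-set mean."""
    var_srs = population_variance / n
    # The n selected items come from independent subsamples, one per rank.
    var_rss = sum(var_order_stat(i, n) for i in range(1, n + 1)) / n ** 2
    return var_srs / var_rss

ratio_uniform = precision_ratio(5, var_order_stat_uniform, Fraction(1, 12))
ratio_exponential = precision_ratio(2, var_order_stat_exponential, Fraction(1))
```

Under these assumptions the rectangular case reproduces (n+1)/2 exactly for n = 5, and the exponential case with n = 2 gives 4/3, in line with the 3.00 and 1.33 quoted.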


Moore, P. H., “The estimation of the Pois- 
son parameter from a truncated distribu- 
tion," Biometrika, 39 (1952), 247-51. 

A counter in a physical problem ap-
peared to stick at certain numbers when
counting radioactive particles, e.g., when
there were more than r emissions in a given
interval. The sample is thus truncated at a
certain point, although the number of ob-
servations beyond this point is known. The
author proposes a simple estimate of the
Poisson parameter: $\sum i n_i / \sum n_i$, where $n_i$ is
the number of intervals with i emissions in
the interval. The total number of observa-
tions beyond the truncated point is
$N - \sum n_i$. This estimate is slightly biased
(of order 1/N). The variance of the estimate was also
derived. This method of estimating the
Poisson parameter was applied to two
series of data and found to agree favorably
with the maximum likelihood solutions.
T. W. Horner, North Carolina State Col-
lege.
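
A ratio estimate of this kind can be sketched as follows. The summation limits did not survive in the abstract, so the choice below (numerator over i = 1, ..., r; denominator over i = 0, ..., r − 1) is an assumption, chosen because the two sums then have expectations in the ratio of the Poisson parameter; the function names are hypothetical.

```python
import math

def poisson_pmf(i, m):
    """Poisson probability of exactly i emissions when the mean is m."""
    return math.exp(-m) * m ** i / math.factorial(i)

def ratio_estimate(counts, r):
    """counts[i] = number of intervals with exactly i emissions, i = 0..r.

    Assumed limits: numerator over i = 1..r, denominator over i = 0..r-1.
    """
    numerator = sum(i * counts[i] for i in range(1, r + 1))
    denominator = sum(counts[i] for i in range(0, r))
    return numerator / denominator

# Illustration with expected (not observed) counts for m = 2.5, truncation at r = 4.
m_true, r, n_intervals = 2.5, 4, 1000
expected_counts = [n_intervals * poisson_pmf(i, m_true) for i in range(r + 1)]
estimate = ratio_estimate(expected_counts, r)
```

With expected counts the ratio returns the parameter exactly, since i·P(i) = m·P(i − 1) for the Poisson law; with observed counts it is a ratio of random sums, which is consistent with the small bias noted in the abstract.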


Rudra, A., “Discrimination in time series 
analysis," Biometrika, 39 (1952), 434-39. 

A sequential test procedure is presented 
to decide if a given time-series is of a ran-
dom, autoregressive, or moving average
type. If one of the latter two, it is shown
to have the lowest possible order. The test
procedure was applied to 28 series and the
results compared with the decisions based
on other techniques. The probabilities of
making various decisions based on 100 op-
erations were computed for two different
known structures. R. L. Anderson, North
Carolina State College. 


Rushton, S., “On a two-sided sequential t-
test,” Biometrika, 39 (1952), 302-8.

A sequential procedure is given to test
the hypothesis that the mean, μ, of a nor-
mal population is zero against the alterna-
tive that μ = δσ, where δ is fixed and σ
must be estimated. The results are extended
to the problem of testing the difference be-
tween two means. R. L. Anderson, North
Carolina State College.




Skellam, J. G., “Studies in statistical ecol-
ogy. I. Spatial pattern,” Biometrika, 39
(1952), 346-62.

A number of distributions arising in
quadrat sampling are considered in rela-
tion to the underlying pattern of organ-
isms. It is shown that the same distribution
may arise from several quite distinct
models. A few ways are briefly suggested,
by considering additional evidence of a
different kind, as to how to decide whether
a given model is appropriate. L. D. Calvin,
North Carolina State College.



Stevens, W. L., “Samples with the same
number in each stratum,” Biometrika, 39
(1952), 414-17.

Some results are given on the efficiency
of constant number vs. proportional sam-
pling. The approximate efficiency is
$E = m_1^2(1-F)/(m_2 - F m_1^2)$, where $m_1$ is the
first moment and $m_2$ the second moment
(about zero) of the frequency distribution
of the number of units per stratum, and F
is the sampling fraction. R. L. Anderson,
North Carolina State College.
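
As a rough numerical illustration, the efficiency can be computed from a list of stratum sizes. This sketch reads the abstract's formula as E = m1²(1 − F)/(m2 − F·m1²), with m1 and m2 the first and second moments about zero of the stratum sizes; that reading, the function name, and the stratum sizes are assumptions made for illustration.

```python
def efficiency(stratum_sizes, sampling_fraction):
    """Approximate efficiency of constant-number vs. proportional sampling."""
    k = len(stratum_sizes)
    m1 = sum(stratum_sizes) / k                   # first moment about zero
    m2 = sum(s * s for s in stratum_sizes) / k    # second moment about zero
    f = sampling_fraction
    return m1 ** 2 * (1 - f) / (m2 - f * m1 ** 2)

equal_strata = efficiency([20, 20, 20], 0.1)      # strata of equal size
unequal_strata = efficiency([10, 20, 30], 0.1)    # strata of unequal size
```

On this reading the efficiency is 1 when all strata are the same size and falls below 1 as the sizes become more variable, since m2 then exceeds m1².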


Whittle, P., “Tests of fit in time series,” 
Biometrika, 39 (1952), 309-18. 


A general least squares test of fit of time
series models is presented. The statistic is
shown to be asymptotically distributed as
χ²; in the limit, the statistic is the ratio
of the geometric and arithmetic means of
the residual variates’ periodogram. The
examples included are concerned mostly
with autoregressive schemes, but it is em-
phasized that the tests are appropriate for
other methods of graduation. In examples,
the test amounts to accepting the hy-
pothesis giving the best fit. D. Gosslee,
North Carolina State College.



Williams, E. J., “Use of scores for the
analysis of association in contingency
tables,” Biometrika, 39 (1952), 274-89.

Given a contingency table with cell fre-
quencies $n_{ij}$ ($i = 1, \ldots, p$ and $j = 1, \ldots, q$),
where $\sum_{i,j} n_{ij} = n_{..}$. If fixed scores
are available for both classifications, a sim-
ple formula is given to estimate the correla-
tion coefficient, r, to be used as a measure
of association. The ratio $(n_{..}-2)r^2/(1-r^2)$ is
approximately distributed as $F(1, n_{..}-2)$,
where the numbers in the parentheses refer
to numerator and denominator degrees of
freedom of the variance ratio distribution.
If only the p row scores are fixed, a method
is given to estimate the q column scores, so
as to maximize the multiple correlation, R.
In this case the ratio $(n_{..}-q)R^2/[(q-1)(1-R^2)]$
is approximately distributed as
$F(q-1, n_{..}-q)$. When neither set of scores
is fixed, R and the scores are estimated by
a canonical analysis. The estimate of R² is
the largest latent root of a matrix whose ele-
ments are simple functions of the cell and
marginal frequencies. Tests of significance
for a proposed set of scores are presented
for q = 2 and 3. An example is presented
with p = 4 and q = 3 and 4. Some indications
are given in an appendix of the adequacy of
the approximations used. R. L. Anderson,
North Carolina State College.
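
For the case of fixed scores on both classifications, the calculation can be sketched as below. The abstract does not reproduce the author's formula, so the ordinary product-moment correlation of the scores weighted by cell frequencies is assumed here; the function name and the 2×2 table are illustrative only.

```python
import math

def score_correlation(table, row_scores, col_scores):
    """Correlation of fixed row and column scores, and the ratio (n-2)r^2/(1-r^2)."""
    n = sum(sum(row) for row in table)
    sx = sy = sxx = syy = sxy = 0.0
    for x, row in zip(row_scores, table):
        for y, count in zip(col_scores, row):
            sx += count * x
            sy += count * y
            sxx += count * x * x
            syy += count * y * y
            sxy += count * x * y
    r = (sxy - sx * sy / n) / math.sqrt((sxx - sx * sx / n) * (syy - sy * sy / n))
    f_ratio = (n - 2) * r * r / (1.0 - r * r)   # referred to F(1, n - 2)
    return r, f_ratio

r, f_ratio = score_correlation([[10, 5], [5, 10]], [0, 1], [0, 1])
```

The resulting ratio would then be referred to tables of the variance ratio distribution with 1 and n − 2 degrees of freedom, as described in the abstract.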


Youden, W. J., and Connor, W. S., “The
chain block design,” Biometrics, 9 (1953),
127-40.

Most experimental designs used in agri-
culture, biology, psychology, etc., involve
a fairly high degree of replication, which is
appropriate because of the magnitudes of
variability encountered. Physical measure-
ments (e.g., spectroscopic determinations)
are ordinarily made with far greater preci-
sion and a high degree of replication repre-
sents a waste of resources. The chain block
design is a very elastic arrangement calling
for two determinations (each in a different
block) on some treatments, and only one on
the others. This results in an over-all degree
of replication which lies between one and
two. Methods for layout and analysis are
given; practical considerations influencing
choice of layout are considered; a numerical
example (42 “treatments,” of which twelve
are repeated, in three blocks) is worked out
in detail. Lincoln Moses, Stanford Uni-
versity.


BOOK REVIEWS 


Cyclical Movements in the Balance of Payments. Tse Chun Chang. Cambridge 
(England): Cambridge University Press, 1951. Pp. x, 224. $3.75. 


See the article by Solomon Fabricant, pp. 79-87 in this issue. 


Demand Analysis: A Study in Econometrics. Herman Wold in association with 
Lars Jureén. New York: John Wiley and Sons; Stockholm: Almquist and 
Wiksell, 1953. Pp. xvi, 358. $7.00.

See the article by H. S. Houthakker, pp. 88-96 in this issue.




Facts from Figures. M. J. Moroney. Baltimore, Maryland: Penguin Books,
Inc., 1953. Pp. 472. $0.85. 


This constitutes a minor revision in content, but a major downward revi-
sion in price, of the volume reviewed by M. A. Girshick in last Septem-
ber's issue of this Journal (Vol. 48 (1953), 645-47). The changes which have
been made in content are described by the following sentence from the 
Preface: “The contents remain almost unchanged, except for the latter part 
of Chaper [sic] II which I have revised to include a new approach to modified 
limit control charts.” Changes which have not been made are described in 
the following two sentences: “I am sorry still to remain persona non grata to 
the index number men and the fortune tellers, but there it is. I give way to
none in my admiration for the theory (may its shadow never be less!), but 
when it comes to a great deal of the practice I simply cannot help chuckling.” 

W.A.W. 




The Application of Operations Research to Industry. Ellis A. Johnson (Director,
Operations Research Office, Johns Hopkins University, Chevy Chase, Mary-
land). Published by the author, 1953. Pp. 61. Paper. Free of charge.


A. W. Swan, Courtaulds, Ltd., Coventry, England


The writer finds this an exasperating publication because parts of it are
so extraordinarily good and provocative, while parts seem to wander off
into complexities that have little useful interest to the worker in Operations
Research.

The introduction is one of the best parts of the book, with a penetrating
analysis of the basic thinking in O.R. The author points out that the methods 
of Operations Research are closer to those of the basic sciences than to those 
of engineering, but that the techniques and methodology have much in 
common with those of industrial engineers and management consultants. 
He goes on to say that O.R. has been concerned from the start with the 
decision-making system in general, and with the problem of providing indi-
vidual executives with management advice. He considers that one of the
main contributions to Operations Research is the use of the team and that 
it has, more consciously than industrial engineering, developed action- 
models based on fundamental theory. He also feels that it has relied much 
more upon complex mathematical concepts and techniques and has realised 
more fully than industrial engineers the necessity of estimating the uncer- 
tainty of its predictions. “Operations Research places a particular demand 
on the analyst’s ability to translate his findings into language which simply 
and clearly sets forth the values, effectiveness and costs of a set of proposed 
courses of action."

After this stimulating introduction the author has a chapter, which the 
writer finds baffling, on the “Relation of the Operations Analyst to the 
Executive," with a set of diagrams illustrating the interactions of various
departments and factors. The ordinary O.R. worker would consider it a 
waste of time to set these down. 

The following sections, “The Operations Checklist for Solving Action Prob- 
lems” and “Planning Detailed Operations” set out principles in diagrammatic 
form, and these diagrams are presumably correct but they are, from the
analyst’s point of view, what “Punch” calls “glimpses of the obvious,” since 
they are so much taken for granted by the O.R. analyst and industrial en- 
gineer that they have become sub-conscious and do not need to be shown as 
complex diagrams. 

Chapter ITI gives a brief description of “Some Selected Analytical Tools.” 
Unfortunately, statistical method, certainly the most useful and widely ap- 
plied O.R. tool today, is dismissed in a brief paragraph as being well outlined 
in Morse and Kimball, an opinion which might not be shared by everyone.
Statistical method is not the only new tool used by the O.R. worker to
distinguish him from the industrial engineer who preceded him, but it can
be stated with some confidence that a very large proportion of present day
industrial O.R. work is based upon the statistical approach; the whole
gamut of statistical method is used and the statistician is an essential mem-
ber of any worthwhile Operations Research Department. At this point, any
publication on Operations Research must necessarily devote a good deal of
attention to the statistical approach and the methods of applying statistical
thinking to industrial problems.

The next tool mentioned is Symbolic Logic, but unfortunately the example
given appears to be one of very few in which this tool has been applied, and a
description of the same example is given in Factory, October, 1953. We
then proceed to the “Theory of Value” and while this is doubtless a useful
method, the writer is not aware of any example in which it has been applied.
The following two sections are, “Queueing Theory” and “Stochastic
Processes,” both of which are, in effect, subsections of statistical method.
The “Theory of Games,” the next section, is, in the opinion of a large number
of O.R. workers, a highly important potential tool, to be used mainly in
conjunction with what is known as linear programming, and there is a con-
siderable literature on the subject. The practical use of this kind of thinking
has, however, not yet proceeded very far, and this again is a potential rather
than actual method.

We then have Chapter IV which gives examples. Unfortunately we start 
with some “polishing of medals” with two examples taken from wartime 
practice. The next example is taken from agriculture—not perhaps the most 
useful for the industrialist. There is also included a brief reference to the 
massive problem relating to the standard of living in Puerto Rico, interesting, 
but not of much immediate use to the potential O.R. worker. The author 
then includes a summary of an excellent paper by John F. Magee of Arthur D.
Little, Inc., “The Effect of Promotional Effort on Sales,” the only purely
industrial example.

In the final brief chapter on the difficulties met in Operations Research the 
author returns to his penetrating analysis and the result is excellent. The 
first difficulty mentioned is that of communication between the scientist and 
the executive. The O.R. analyst has to learn the executive's language and 
how to translate into that language. “An operations research study becomes 
effective in proportion to the amount of effort spent in communicating the
effects of the research and clearing up with the executive on a personal basis 
all the questions involving the validity of the study. Since very few analysts 
are adept at, or recognise the need for such ability on their part, the results 
of much good O.R. are never used.” The author lists other difficulties in 
which he includes the extreme difficulty in getting highly skilled specialists 
from very diverse and often antagonistic disciplines to work well as a closely 
integrated team. This difficulty can, however, be readily overcome by adopt- 
ing for every job the simple plan of having the O.R. analyst in charge form a 
team, consisting of the appropriate members of his own staff and the tech-
nicians and other specialists whose knowledge will be most valuable, on the
basis that the team will work together for a common aim, and that each
member of that team will stand to gain personally in kudos.

In connection with this review it is fair to point out that the subject of 
Operations Research is a thorny one. There is, today, no completely satis-
factory book on the subject and anyone who has the courage to tackle the
subject, as Ellis Johnson has done, deserves praise, especially if he has suc-
ceeded in giving useful suggestions, as is certainly the case in this book.


The Revision of the Rapid Transit Fare Structure of the City of New York.
William S. Vickrey. New York: Mayor’s Committee on Management Survey of
the City, 1952. Pp. xii, 156. Paper.


William B. Buckland, London Transport Executive¹


This report is the third of the Technical Monographs from the Finance
Project of the Mayor's Committee on Management Survey of the City of 


1 The views expressed are purely those of the reviewer and are not necessarily the views of the 
organization.




New York. Its author, one of the Staff Members of the Finance Project, is 
an Associate Professor of Economics at Columbia University.

After a short introduction on the background of the problem, the report
puts forward a marginal costs basis for deciding upon a fare structure to pro-
mote the optimum utilization of the particular transit facilities under dis-
cussion. This theme is further developed in Chapter 4 (and its Appendix)
although it is preceded by a chapter on “Patterns of Traffic,” which is
largely an account of trying to make bricks without (statistical) straw,
and followed by a development of this traffic theme on the lines of the diffi-
culties of adjusting services to conform to the traffic pattern. The mechanics
of fare schemes and collection devices are dealt with in Chapters 6, 8 and 9
while the intervening Chapter 7 again develops the traffic theme in relation
to fare changes. Finally there are two short chapters on considerations of
equity and what may be called general Social Planning.

The economic theme of the report may perhaps be illustrated by two ex- 
tracts from pages 4 and 5: 
Since fares must necessarily be set in advance and announced to potential
passengers if they are to have the proper effect upon the passenger's decision
to travel or not to travel at a given time and place....


Only if the fare fully reflects at all times and between all points the costs
of carrying additional passengers will the fare structure achieve an efficient
utilization of the facilities....


The development of the principle ultimately produces a set of proposals on
the desirable fare structure for which it would be difficult to carry out the
intentions expressed in the first of these extracts, and which would be be- 
wildering to the travelling public if they could. 

The important point of the non-monetary costs of travel in the form of
fatigue and time costs is brought out very well but the effect is somewhat
marred when it is recorded that:


the passenger would be willing, in order to avoid the inconvenience, to pay
an amount that would cover the additional money costs (of providing addi-
tional services)....

The idea of passengers being willing to pay more money is quixotic since 
these hedonic cost components are capable of infinite variation as between
individuals as well as in time and space. Given a fare structure and a pattern
of services, the traveller will tend to minimize his total costs, monetary and
non-monetary, according to the needs of the moment and changes in these
needs are, at this level of detail, part of the random fluctuations which are
present in all patterns of travel. Therefore it is suggested that the fare struc-
ture which will help the traveller most in this task is one for which the money
cost is virtually the same for comparable distances at any time of the day.
In this way the passenger has the greatest opportunity to work out his own
salvation and the traffic flows unconstrained by differential fares. It must be
said, however, that this kind of scheme requires a degree of [illegible] of the
fare structures for all the major forms of public passenger transport which
may not at present be available in New York. It seems to this reviewer that 
the disadvantage with schemes for differential fares according to time and 
place—such as are largely advocated in this report—is that as soon as a 
fare structure is promulgated its operation tends to change the flows of 
traffic upon which it is based and it then becomes necessary to change the 
fare structure if the optimum position is to be maintained. Thus, the fare 
to be paid for a given journey is a matter of some doubt for the travelling 
public which must be psychologically undesirable.

To deal with some rather more practical issues: on page 10 there is an 
equation expressing operating expenses in terms of certain physical operating
characteristics. The equation appears to be the usual form of multiple re-
gression equation assuming no interaction between the various characteris-
tics but it has, in fact, been developed by the non-statistical process of allo-
cating the operating expenses to the various physical characteristics of 
operation. For example, the salaries of motormen are allocated to the char- 
acteristic of train miles. This may have been the only reasonable method 
available but it would have been desirable to have given more space to the 
interpretation in order to avoid possible confusion. In Chapter 3 there does 
not appear to be any reference to the survey work on travel in New York 
which has been done by various organizations in connection with trans- 
portation advertising and which should have been useful for this investiga- 
tion. In connection with the availability of data for this study it would have 
been useful to have put forward some recommendations on how the various 
deficiencies might be filled. The detail in Chapter 9 of the various collection 
devices associated with the different kinds of fare structures discussed in
this report gives the reviewer a distinctly unhappy feeling as to their prac-
ticability both from the engineering and the commercial point of view. There
appear to be far too many things capable of going wrong. These vary from
the station staff not changing something vital at the right time, through the
peak-period passenger not having ready the right coins, to the mechanical
and electrical devices necessary to display illuminated signs for the currently
required fare (with gongs to signal the alterations) and the amount of change
the passenger may expect to receive as a result of his not having the correct
coins available.

The considerations of equity and social planning at the end of the report 
begin to place the whole subject into a perspective in which passenger trans- 
port takes on the aspect of something which is very much concerned with 
the human business of living, working, and playing. Transport is only a 
means to an end and that end may well be the optimum utilization of the
human and material resources of a given area, say New York. Surely, since 
this investigation was worthwhile, it should have been approached in the 
spirit of an operations research project, for its solution demands the welding 
of economics to transport operation and engineering with statistical meth-
ods as the flux. As it is, an undue concentration of attack on the first of
these has produced a report which will probably make the transport operator
and his engineering colleagues shudder: the statistician will merely lament 
once more that most of the data necessary for the problem were not avail- 
able. 


Measuring Your Public Relations, a guide to research problems, methods and
findings. Herman D. Stein. New York: National Publicity Council for Health 
and Welfare Services, Inc., 1952. Pp. 48. $1.25. 


Marie Jahoda, New York University


The aim of this booklet is to provide the professional personnel in health
and welfare organizations with a balanced view of what research can do
for their agencies and to describe different research procedures adequate for
different types of problems. Mr. Stein does not want to “sell” research;
rather he wishes to enable persons concerned with the practical work of
health and welfare agencies to decide for themselves what research they
need.

With this aim in mind he discusses: the nature of the problems which
arise in the public relations of voluntary agencies; informal research tech- 
niques (they might better be called fact-finding techniques) which an agency 
can apply largely without expert help; pre-testing of written material; the 
value and limitations of public opinion polls; communications research, etc. 

The presentation of these matters is straightforward without being over-
simple. There are many examples used in the text to illustrate a point. A
[illegible] list of selected references on a more technical level completes the pub-
lication.

It lies in the nature of such a publication that it offers no new ideas. The
research person who is about to embark for the first time on a study for or
of an action agency might find it helpful, however, to glance through this
booklet. Whether or not it fulfills its aim with its actual target audience is
in itself a question for research in public relations.




The WOI-TV Audience. Mimeo Series No. 1. Ames, Iowa: Statistical Labora-
tory, Iowa State College, 1952. Pp. 125. Paper.


Lester R. Frankel, Alfred Politz Research, Inc.


The WOI-TV Audience is a statistical report describing the size and char-
acteristics of the television audience located within a 50 mile radius of 
Ames, Iowa. The data are based upon a sample survey, and were obtained 
for the purpose of establishing bench mark data against which future sur- 
veys may be compared. The text material in this report is a description of 
the methods used to obtain the data. 

The survey design, the questionnaire, the sample plan, and the field
operations were the responsibility of the Statistical Laboratory, Iowa State
College and, as is to be expected, the techniques used appear to be superior to
those employed in the usual stereotyped commercial television audience
study.

190 AMERICAN STATISTICAL ASSOCIATION JOURNAL, MARCH 1954

The sampling procedure was designed in such a manner as to make full use
of the resources that were available. The sample was of the multiphase,
single-stage type. After a sample of households had been selected, household
characteristics data relating to the ownership of TV sets were obtained. In
the second phase, additional data were obtained from all television
households, 25 per cent of the non-television households, and 50 per cent of
all the adults in these two groups of households.
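The household part of the second phase can be sketched in a few lines. This is only an illustration under stated assumptions, not the Laboratory's procedure: the 2,400 listed households (400 segments of about 6), the one-in-three television ownership rate, the function name `second_phase_sample`, and the inverse-fraction weights are all hypothetical, and the 50 per cent subsample of adults is left out.

```python
import random

def second_phase_sample(households, non_tv_fraction=0.25, seed=0):
    """Keep every television household and a random fraction of the rest.

    Returns (household, weight) pairs. The weight is the inverse of the
    second-phase sampling fraction, an assumption made here so that the
    subsample can be expanded back to the first-phase total.
    """
    rng = random.Random(seed)
    tv = [h for h in households if h["has_tv"]]
    non_tv = [h for h in households if not h["has_tv"]]
    n_keep = round(non_tv_fraction * len(non_tv))
    kept_non_tv = rng.sample(non_tv, n_keep)
    selected = [(h, 1.0) for h in tv]
    selected += [(h, 1.0 / non_tv_fraction) for h in kept_non_tv]
    return selected

# Hypothetical first-phase listing: 400 segments of about 6 households each,
# with every third household owning a set (an invented ownership rate).
households = [{"id": i, "has_tv": i % 3 == 0} for i in range(2400)]
sample = second_phase_sample(households)
print(len(sample), sum(weight for _, weight in sample))
```

With these invented figures the weighted count of the subsample reproduces the first-phase total, which is the property such a design has to preserve.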

Single-stage sampling was employed in the selection of the households to
be included in the sample. A sample of 400 segments was selected to
represent the survey area, and all households within each of the designated
segments (approximately 6) were included in the first phase sample. Of
particular interest to the practicing sampling statistician is the discussion of
the uses of the Master Sample Maps, the City Directory, photocounting,
and the methods of cruising for the determination of segment sizes.

Aside from the problem of sample design (which is merely a blueprint) the
problems of the execution of the survey are discussed in detail. The training
was particularly important in view of the fact that the second phase of
sampling was accomplished by the interviewers at the field level. In addition,
since some of the questions on the questionnaire dealt with the respondent’s
activity on the day before the interview, it was necessary to spread the
interviewing equally over the seven days of the week.

The format and organization of the report as well as the presence of some
minor arithmetical inconsistencies tend to detract from the impressiveness
of the study. However, it is clear from the description that the design was
not intended to produce a quality impression but a quality study in the most
efficient manner possible. For example, ten years ago it was more or less
assumed that the steps in selecting a sample should follow the time-consuming
sequence of primary unit selection, segment selection, listing of addresses
by interviewers, final household selection in the central office from the often
shaky prelistings, and interviewers revisiting the selected locations to
interview. In the study of the WOI-TV audience there was an efficient utilization
of man hours. Real inventiveness based on genuine statistical knowledge
obviously played a role in finding an additional method by which the
household and individual selection was accomplished in a single field operation.


Social and Psychological Factors Affecting Fertility. Volume Three. P. K.
Whelpton and Clyde V. Kiser. New York: Milbank Memorial Fund, 1952. $1.00.
Paper.


E. Lewis-Faning, Welsh National School of Medicine


Fourteen years ago, Raymond Pearl (1939) in his book “The Natural
History of Population” calculated pregnancy and live birth rates for
various groups of women, differentiating those who had, and those who had
not, used contraceptive methods. He discussed among other things the effects
of contraceptive efforts on the pregnancy rates in relation to economic status,
education and religion.

Ten years later, the Royal Commission on Population (1949) compared
pregnancy rates for periods of reproduction in which (a) no birth control
was used, (b) contraception was abandoned, (c) contraception was being
used. They also compared the average desired and actual size of the family,
and the number of unwanted children for groups of women classified
according to the degree of success attained in planning and spacing a family by
contraceptive methods.


The collection of seven papers here reviewed goes still more deeply into
the subject and inquires what social and psychological factors contribute
towards successful family planning. Is it those who feel most economically
insecure who successfully restrict the size of their families? Is it those who
plan other aspects of their lives? Is it those with poor health or those with
a feeling of personal inadequacy? These are among the problems which
constitute the subject matter of these papers.

It will be noted that in both the earlier publications the indices used were
quantitative or, if not, were definitely factual (grade of education reached
or religious denomination), and that to such data statistical methods and
reasoning could legitimately be applied. The reader’s reactions to the volume
under review must depend on whether or not he accepts that indices of such
nebulous qualities as “a feeling of economic insecurity” or “a tendency to
plan in general” or “a feeling of personal inadequacy” (indices to which
statistical methods and reasoning have here also been applied) have been or
indeed can be satisfactorily constructed.

One example will illustrate the dubiety which the reader should not fail
to feel. Replies by 1444 wives to the question “Do you plan your buying to
take advantage of sales?” were tabulated as below, showing the distribution
according to success in family planning (“A” being the most successful) of
the group of women giving each specific answer.

                 Percentage distribution by fertility planning status

Plan to buy
at sales          No.       A      B      C      D

Very often       47[?]     31     13     31     24
Often             481      25     17     30     28
Sometimes         419      28     13     35     24
Seldom             36      31     11     19     39
Very seldom        21      [illegible except for one entry, 52]

On this, the authors comment: “One group of wives answered ‘Very
Often.’ Among these, 44% were in the effective fertility planning categories
(A and B). Only 28% of the wives answering ‘Very seldom’ were in the
effective fertility-planning groups.”

Can answers such as these be used as indices of “a tendency to plan in 
general”? If one has been reared in the belief that most of the goods offered 
in sales are manufactured especially for sales and are generally of an inferior 
quality, one plans not to buy at sales. Nevertheless, if there is a large family, 
the wife may be forced to buy in the cheapest market. Does this make her a 
(good) planner? In fairness, it must be stressed that in the instance cited, 
the conclusions are based on the replies to many more questions of the same 
type than the one used as illustration. The volume as a whole must contain 
some hundreds of tables like the one quoted. There is no lack of data and 
the arguments from interpretation of the figures, although involved, seem 
reasonable, always provided that the indices used do really measure what 
they are supposed to be measuring. On this fundamental matter some scepti- 
cism is not unjustified. 

In one section, scores are allocated to sets of replies to different questions,
and Pearsonian correlation coefficients calculated as a measure of the
intercorrelation between the different indices used, in spite of there being no
evidence that the intervals represented by differences between the scores were
uniform. Is the difference between “Very often” and “Often” equivalent
to the difference between “Often” and “Sometimes”?
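The point can be made concrete with a small sketch. Everything in it is hypothetical: the ten replies, the values of the second index, and both scoring schemes are invented for illustration and are not taken from the volume under review. The same replies are scored once at equal intervals and once at unequal intervals, and the Pearsonian coefficient is computed each time.

```python
from math import sqrt

def pearson(xs, ys):
    """Product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

CATEGORIES = ["Very seldom", "Seldom", "Sometimes", "Often", "Very often"]

# Two hypothetical scoring schemes for the same ordered replies.
equal_scores = {c: i + 1 for i, c in enumerate(CATEGORIES)}  # 1, 2, 3, 4, 5
unequal_scores = dict(zip(CATEGORIES, [1, 2, 3, 6, 10]))     # uneven intervals

# Hypothetical data: ten replies and a second index for the same ten wives.
replies = CATEGORIES * 2
other_index = [1, 2, 2, 3, 5, 2, 1, 3, 4, 4]

r_equal = pearson([equal_scores[r] for r in replies], other_index)
r_unequal = pearson([unequal_scores[r] for r in replies], other_index)
print(round(r_equal, 4), round(r_unequal, 4))
```

The two coefficients differ although the replies are identical, so the value reported depends on an interval assumption for which the data offer no evidence.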

Qualified statisticians reading these papers can safely be left to utilize
their professional knowledge and experience in assessing the statistical
validity of the conclusions reached, but there is a real danger that the
statistically unsophisticated (who comprise possibly the majority of workers
in this field) will be overawed by the facade of diagrams and statistical tables
into accepting the conclusions as authoritative and final. To such, two things
need pointing out: first, that the 1444 couples whose reproductive and
contraceptive histories, and personal opinions as to the impact of their economic
circumstances and psychological characteristics on their desire for children,
form the data common to all seven papers, although a homogeneous group
and therefore presenting certain advantages for studies of this type, are far
from being representative of any past or present community in the U.S.A.;
second, that the answers recorded on the questionnaires were given as long
ago as 1941-42 by couples married in the years 1927-29 and reared in the
traditions of 1900-10. Strictly, then, the replies should be interpreted in the
light of the economic circumstances that prevailed when they were building
their families (mainly in the era of depression of the late 20’s and early
30’s) rather than in the light of those of the 1950’s. That the authors have
records enabling them to do this, or indeed that they realize the essentiality
of attempting it, is not clear. Many aspects of life changed even between
1941 and 1951.




Statisticians, especially those interested in the field of human fertility, 
would be rendering a service by reading and commenting on this publication. 


REFERENCES 


Raymond Pearl, The Natural History of Population, Oxford University Press,
London, 1939.

Papers of the Royal Commission on Population, Vol. 1, Family Limitation and 
its Influence on Human Fertility, H.M. Stationery Office, London, 1949. 




Community Wage Patterns. Frank C. Pierson. Berkeley and Los Angeles:
University of California Press, 1953. Pp. xvii, 213. $3.75.


Mitchell O. Locks, University of Oklahoma


This volume is an attempt to determine the nature and causes of
relationships between post-World War I wage developments in Los Angeles
County, California and those in other large metropolitan areas. The book has
chapters on each of the following topics: Pre-1940 Wage Levels; 1940-1949
Wage Levels; Local Influences on Wage Levels; Industry Influences on Wage
Levels; Relationship Between Employment and Wages; Relationships
Between Investment, Productivity, and Wages; and The Influence of Unions on
Local Wages. As can be seen from the titles, each chapter covers a topic so
important that it could be a separate study in itself.

The author has used a great wealth of material assembled from many
different sources, principally other books about California. However, with a
study having such broad scope confined to only 164 printed pages (besides
Appendix Tables), it is not surprising that at times the author seems to
wander aimlessly in a forest of secondary statistics. The reviewer received
the impression that the author was not sure of what his data showed him,
and therefore had to state many of his conclusions without strong conviction.

An example of this may be found in his analysis of the relationship between
Employment and Wages. He computed certain rank correlation coefficients
between employment and wages for the period from 1929 to 1939, and also
for the period from 1940 to 1949. His average rank correlation coefficient for
six cities for the earlier period was statistically significant (although the
individual rank correlation coefficients for each of the six different cities for that
period were not), while the average rank correlation coefficient (as well as
the six individual rank correlation coefficients) for the later period was not
significant. On the basis of this very meagre evidence, the author makes the
following statement without further explanation:


This suggests that industry employment and wage levels move together when
there is a large amount of unemployment, but that these two variables bear
no consistent relationship to each other when labor market conditions are
relatively tight. (p. 116)




This statement contradicts the widely-held view that the elasticity of 
labor supply with respect to wage rates is high in depression. For that 
reason, the author missed an opportunity to make a significant contribution 
to wage theory by not further explaining the reasons for this observation.

Another example of inadequate explanation occurs on page 61: 

. . . hourly earnings in Los Angeles manufacturing rose less rapidly than
in the United States between 1945 and 1949 (30 per cent as against 37 
per cent). 


However, data obtained from Appendix Table I show that Los Angeles
retained its ranking in average manufacturing wage levels during the period
from 1945 to 1948. (Comments concerning the validity of some of the data
in Appendix Table I will be found later in this review.) These data purport to
show that Los Angeles was seventh in average manufacturing industry wage
levels in a group of 20 large cities in both April, 1945 and April, 1948.
Although this discrepancy in findings made at two different points in the book
is a relatively minor one, the author should have furnished an explanatory
note.

The overall statistical facets of this book inspire the following comments: 

(1) Applications of the Analysis of Variance. In Chapters VI and VII the
author performed analyses of variance on eleven different manufacturing
industries in six different cities with respect to percentage changes in
employment, average annual earnings per worker, and value added per worker
from 1929 to 1939. Separate analyses were performed for each of these three
characteristics using the “F” test, and the results were reported in Tables
11 and 20. However, examination of Tables 8 and 19 shows that the same
data were ranked within communities for purposes of computing rank
correlation coefficients. Thus the data in those tables were already in a form
such that, with a negligible amount of additional calculation, a “Friedman”
rank analysis of variance could have been performed.¹ The reviewer submits
that the latter method would be preferable because of the normality
assumption implicit in the “F” test.

(2) Appendix Table I. There is reason to doubt the validity of the data in
the last column of Appendix Table I. Since much of the analysis of wage
movements in Chapters II and III is based on that table, there should be
further explanation of how those data were obtained. The reasons are:

(a) The source given for the data in that table is a book published in
1946,² while this fourth column gives manufacturing wage indices for April,
1948.

(b) The data in the first three columns of that table give, respectively,
manufacturing wage indices for 20 different cities for April, 1941; April, 1943;
and April, 1945, each adjusted to an average index of 100 for its date.
However, column 4 has an average index of 144 for the 20 cities. Thus it would
appear that a different basis was used for calculation of the April, 1948
indices than had been used for the earlier three dates.

¹ Milton Friedman, “The use of ranks to avoid the assumption of normality implicit in the analysis
of variance,” Journal of the American Statistical Association, 32 (1937), 675-701.
² Ruth McFarlane, Wage Rate Differentials: Comparative Data for Los Angeles and Other Urban
Areas. (Los Angeles, 1946.) The Haynes Foundation.
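The “Friedman” rank analysis of variance mentioned in comment (1) can be sketched as follows. This is a minimal illustration, not a reconstruction of the reviewer's or the author's calculation: the figures are invented, cities are taken as the blocks within which the industries are ranked, ties receive average ranks, and no correction for ties is applied.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the ranks i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = shared
        i = j + 1
    return ranks

def friedman_statistic(table):
    """Friedman's chi-square-r for n blocks (rows) and k treatments (columns).

    chi2_r = 12 / (n k (k + 1)) * sum_j R_j^2 - 3 n (k + 1),
    where R_j is the sum over blocks of the within-block ranks of treatment j.
    """
    n = len(table)
    k = len(table[0])
    rank_sums = [0.0] * k
    for row in table:
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)

# Hypothetical percentage changes: 3 cities (rows) by 4 industries (columns).
changes = [
    [12.0, 5.0, 8.0, 20.0],
    [10.0, 4.0, 9.0, 15.0],
    [14.0, 6.0, 7.0, 18.0],
]
print(friedman_statistic(changes))
```

For a moderately large number of blocks the statistic is referred to the chi-square distribution with k - 1 degrees of freedom, so no assumption of normality is needed.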

The reviewer believes that despite indicated shortcomings of the book, the
author made a contribution to the study of wage patterns. In large measure,
the scope of his analysis is limited by the fact that he had to use secondary
data. However, there is need for more studies of this type using more primary
data than are now available for this purpose.




Punch-card Methods. Harry P. Hartkemeier. Dubuque, Iowa: Wm. C. Brown
Company, 1952. Pp. xvii, 360. $5.00. Paper. 


P. C. Hamer, University of Wisconsin


The subtitle of this book is “How to Use and Operate Punching, Sorting,
Electronic Statistical, Tabulating, and Accounting Machines Including
Types 24, 26, 75, 80, 82, 101, 402, 403, and 407.” All the machines discussed
are the IBM models. Basically the book is an illustrated reassembly of
materials contained in the manuals provided by the IBM Corporation free of
charge. For purposes of instruction this arrangement may be better than
the individual manuals.

The book is slanted toward commerce and accounting students, the
problems being primarily in those fields of interest. The most difficult
mathematical problem dealt with seems to be progressive digiting. Since there is a
decided lack of good expository material on punch-card methods, more
books in the field are indicated. However, the reviewer feels that the author
has neglected some of the basic punch-card equipment and methods in this
book.

For example, the summary punches, the collators, reproducers, and the
calculating punches (602A and 604) are not discussed although each is of
great usefulness in statistical and commercial work. The author gives the
impression that all machines not discussed are no longer being
manufactured. This is by no means the case; all the additional machines mentioned
above are still in production.

Since the book neglects so many virtually essential machines for scientific
and accounting practice it cannot be recommended as a text without
extensive supplementary materials.

The book is reproduced by photo-offset and has a soft paper cover and a
permanent loose-leaf binding. The text is well written in view of the
necessarily segmented character of such manuals.




Associated Measurements. M. H. Quenouille. New York: Academic Press Inc.,
1952. Pp. x, 242. 


Isadore Blumen, Cornell University


Here is a handbook on correlation, regression, and related topics which
many statisticians will find useful. The formal layout of analyses, hints 
on practical pitfalls, and the details of manipulative technique are illustrated 
through the copious use of numerical examples. 

Unfortunately, the very nature of such a handbook seems to have forced 
the omission of so much in the way of basic ideas that it cannot be recom- 
mended to non-specialists. The statistician will want the book in order to 
have details readily available. The more general reader, however, will be 
bothered by such fundamental problems as the rationale for the choice be- 
tween various methods proposed. For him there are not always clear answers. 

The book is divided into four parts. The first forty-eight pages contain 
those “quick and dirty” methods which the author finds most useful and 
which require only plotted data. A section on similar numerical methods is 
included later under the heading of grouping observations. Included in these 
sections are graphical methods for bivariate and multiple correlation prob- 
lems and curvilinear regression, a few non-parametric tests, and some de- 
vices adapted to situations where data is easy to obtain, great accuracy not 
wanted, and simple computation important. Omissions, due apparently 
both to the organization of the book and to the desire to keep it down to 
manageable size, include most non-parametric procedures—e.g., the rank 
correlation coefficient (which name the author chooses to bestow on
Kendall's tau) and runs tests. Biserial, tetrachoric, and related correlation
procedures are not mentioned.

The second part covers the conventional topics: bivariate correlation and
regression, multiple and partial correlation, and curvilinearity. This is well
done, although one might quibble that the author's treatment of such
problems as identifiability of parameters is not strictly accurate, the conditions
he gives being sufficient rather than necessary. The problem of using
correlated variates for screening and selection, as in personnel testing and genetics,
is not treated.

The third part includes grouping, analysis of covariance, and general
pointers on the organization of investigations. Readers who have not been
exposed to covariance before should be warned that this exposition is not 
particularly lucid. 

The last sixty or so pages deal with a variety of problems. The section
on multivariate analysis is remarkably well done for so condensed a
non-mathematical discussion. The problems of time series, not being easily
reducible to elementary terms, are somewhat less satisfactorily treated but
this section will nevertheless be quite useful. There is also a section devoted
to a variety of hints and comments.

Most of the more desirable tables are included, as is a fairly extensive,
but not selected, bibliography. Teachers who use this as a laboratory manual
in their courses, for which the book is well adapted, may complain of a lack
of problems for students to work out for themselves. It would also be
desirable if in future editions the conventional names for tests and procedures
were used.

From an over-all point of view, the reviewer was disturbed by the lack of
discussion of alternate hypotheses, of the power of various tests proposed
and of the relative quality of various estimation procedures. Why reject
some observations in one case and not in another? Why use a graphical
method instead of the more common estimates? Why choose one
non-parametric device over another? Surely more thoughtful answers could have
been provided by so competent an author.


Hypothesis Testing in Time Series Analysis. Peter Whittle. New York: Hafner
Publishing Company, 1951. Pp. 120. $3.50. Paper.


John Gurland, Iowa State College


In the subject of time series analysis, where the paucity of suitable
statistical tests is conspicuous, a book of this sort is a welcome addition to
the literature. It abounds in suggestions and ideas which should stimulate
more research in this area.

The spirit of the book is commendable. It attempts to give a general
rationale for discriminating between different random structures which might
be regarded as having generated the same observed time series. The null
hypothesis and the alternatives are always stated explicitly. Then tests are
constructed which presumably are optimal in some sense. Some ingenious
ideas and devices are propounded but Whittle is somewhat carried away in
his enthusiasm, with the result that clarity and rigor are sometimes sacrificed
for expediency. The reader of this book should be warned that this is not a
book which may be read uncritically, but rather one which should be read
with caution and reserve.

The first two chapters comprise a brief review of some important results
of statistics and probability theory, with a few sketchy proofs. Chapter 1
outlines the testing of hypotheses and the construction of a most powerful
critical region by means of a sufficient estimator. The second chapter
reviews the notion of a spectrum for a stationary stochastic process and gives
the corresponding spectral expansion of the process. The discussion centers
mainly on a discrete process as this is the type considered throughout the
book. By restricting the spectral density to be a rational function of
z = e^{iω}, it is shown that the corresponding stochastic process is either an
autoregressive scheme or a moving average scheme or a certain generalization of
these.

In Chapter 3 a most powerful test is constructed, on the assumption of
an underlying normal distribution, for testing whether N consecutive
observations have a particular covariance matrix. The test criterion is a ratio of
quadratic forms, and is the same as that given by Lehmann and Stein
(Annals of Mathematical Statistics, 1948, p. 504). If the covariance matrix is
assumed to be a Laurent matrix (as is the case for a stationary process), and
N is large, a test function is constructed which is a ratio of linear functions of
the empirical covariances, and which is simpler to compute than the
aforementioned ratio of quadratic forms. The large sample distribution of the test
is given and Whittle states that this test in the case of a stationary process
has “practically the same power” as the exact test mentioned above. The
reviewer cannot resist wondering what happens in the case of small or
moderate values of N, since both the construction of the test and the distribution
of the criterion assumed large values of N.

Chapter 4 gives some ingenious approximative methods for getting the
inverse of a Laurent matrix, also its latent roots. Circulant matrices are used
in the approximation and the spectral density of the process is elegantly
applied. It is not at all clear, however, how good are the approximations. As for
the approximate distributions of quadratic forms and ratios of quadratic
forms given in this chapter, the reviewer would like to make a few comments.


well in the numerical example given in Chapter 5, but as a general method it
has some inherent difficulties which might be pointed out here. In the first
place, the problem of finding the moments is, in general, prohibitive. For
the special cases considered by Whittle the distribution is required for
independent variables, and the denominator is such that the ratio is distributed
independently of the denominator. These circumstances greatly simplify the


expansion is suspect. If the range of the ratio is finite then a condition of
Cramér’s is satisfied which assures that the Gram-Charlier series converges
to the distribution function. If the range is not finite, then the same
questions as above regarding the convergence and the validity of the asymptotic
expansion apply.




Some further remarks concerning this chapter are apropos regarding the
assumption of circularity. A rather sweeping statement appears in referring
to R. L. Anderson's distribution, to the effect that the assumption of
circularity is “really no great drawback as N would have to be quite small before the
power would be seriously diminished by this assumption.” What is meant by
the terms “really no great drawback,” “quite small,” “seriously diminished,”
is vague here. If N must not be “quite small” for the assumption of
circularity to be “seriously” questioned, then one could ask whether or not the
normal distribution is an adequate approximation. If, besides the circularity
assumption, the characteristic function is “smoothed,” as suggested by
Koopmans (Annals of Mathematical Statistics, 1942) or more generally as
extended by Whittle, then one may well wonder how far astray the resulting
approximation is from the original distribution before circularity and
smoothing were applied. How large or small N must be and what the corresponding
effect will be on the power and on the true distribution is indeed a moot
question and one which, in the present stage of development of the theory,
is usually answered by conjectures which to this reviewer seem unduly
optimistic.

Chapter 5 provides a numerical example for a test of randomness against 
certain alternative hypotheses. The approximative methods developed in 
the earlier chapters are used, and seem to work quite well for this example. 

Chapters 6 and 7 are entitled “Non-parametric Discrimination” and are
devoted mainly to the problem of constructing suitable tests regarding the
structure of a process. The title of these chapters is misleading because the
tests are constructed from a probability density involving unknown
parameters and, as such, are parametric tests in the conventional sense of the term.
Whittle, in fact, assumes the parameters have a probability distribution and
proceeds to apply Bayes’ theorem to construct a posteriori likelihood
functions, then uses a likelihood ratio of such functions. It is surprising that such
an anachronistic approach could have found its way in so recent and
otherwise modern a book. By assuming that N is large and choosing a convenient
distribution for the parameters various test criteria are obtained. Many of
the tests could be constructed directly without appealing to Bayes’ theorem.
Among the tests considered are the following: (a) Test the order of a moving
average scheme against the alternative of a different order. (b) Test the
order of an autoregressive scheme against the alternative of a different
order. (c) Test whether a process is an autoregressive scheme of a fixed order
against the alternative of a moving average scheme of a fixed order. (d)
Opposite of (c).

In Chapter 8 numerical examples of (c) and (d) are discussed and in
Chapters 9 and 10 the methods of the earlier chapters are applied to
construct periodogram tests and tests of fit, respectively.

The final chapter “Indeterminacies in model structures” provides an
interesting investigation into the non-uniqueness of the linear structure of a
stochastic process for a given covariance function. In the case of a normal
process, the indeterminacy is inherent; however, if the process has some
non-zero cumulants of higher order than the second, a method of discrimination is
proposed.

In conclusion, this reviewer would like to quote from F. N. David's review
of this book which appeared in Biometrika, Vol. 39, May, 1952. “. . .
However it is more than sufficient to say that Mr. Whittle is a pioneer and it
has always been the fate of pioneers both to stimulate those who follow and
to be criticized by those who are wise after the event. All who are interested
in time series will benefit by reading the book if only from the stimulation
and excitement which come from trying to go one better than the author.
This is an important contribution to the research work on time series and
may well prove to be the foundation stone of a satisfactory theory.”


Tables of Poisson Distribution. Tosio Kitagawa. Tokyo, Japan: Baifukan,
1952. Pp. xii, 150. $3.50.


William G. Cochran, Johns Hopkins University


These tables give the individual terms e^{-m} m^x / x! of the Poisson
distribution. Unlike E. C. Molina’s tables (Poisson's Exponential Binomial
Limit, D. Van Nostrand, New York, Fifth printing, 1949), they do not
contain cumulative sums, and they stop at m = 10, whereas Molina goes up
to m = 100. However, the interval of tabulation is much smaller than
Molina's, being only 0.001 in m up to m = 1, and thereafter 0.01 up to m = 10.
The following table compares the intervals available and the number of
decimal places (D.P.) given by each author.


                      Interval in m                  D.P.
Range of m        Kitagawa     Molina        Kitagawa     Molina

 0.001- 0.010      0.001       0.001        [illegible]     7
 0.010- 0.300      0.001       0.01             8           7
 0.300- 1.000      0.001       0.1              8           6
 1.00 - 5.00       0.01        0.1              8           6
 5.00 -10.00       0.01     [illegible]         7           6
10.0  -15.0        none        0.1              -           6
15    -100         none     [illegible]         -       [illegible]
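For readers who want to check an entry, the individual terms, and the cumulative sums that Molina tabulates but Kitagawa does not, can be computed directly. The sketch below assumes nothing about how either table was produced; the function names are illustrative, and the direction of cumulation (from zero upward) is chosen for convenience and may not match Molina's.

```python
from math import exp, factorial

def poisson_term(m, x):
    """Individual Poisson term e^{-m} m^x / x! for mean m and count x."""
    return exp(-m) * m ** x / factorial(x)

def poisson_cumulative(m, x):
    """Sum of the individual terms for counts 0 through x."""
    return sum(poisson_term(m, j) for j in range(x + 1))

# A small value of m, in the region where the finer interval matters most.
for x in range(4):
    print(x, f"{poisson_term(0.005, x):.8f}")
```

Eight decimal places are printed to correspond to the finest precision shown in the comparison above.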


For anyone engaged in accurate computations with small values of m,
Kitagawa's tables are a valuable addition to the library. They are
attractively printed, with ample separation of the figures so as to diminish
eyestrain and copying mistakes. The text is in English.

Additional tables give single and double inspection plans. These plans are
based on the same principle as those by Paul Peach (Industrial Statistics and
Quality Control, Edwards and Broughton Co., Raleigh, 1947), but were
derived independently by Kitagawa. If α = consumer’s risk, β = producer's
risk, Kitagawa gives tables for finding the sample size (or sizes) and the
rejection number (or numbers) for α = 0.1, β = 0.1; α = 0.1, β = 0.01; α = 0.01,
β = 0.01, whereas Peach’s tables have α = β = 0.05.

Stechert-Hafner, Inc., 31 East 10th Street, New York 3, inform me that
they hope to have a supply of Kitagawa's tables at $3.50 each. From my
correspondence with the Baifukan Company, it appears that the company
does not wish to promote direct sales from Japan.

—— 


50-100 Binomial Tables. Harry G. Romig (Quality Manager, Hughes Aircraft
Company, Culver City, California). John Wiley & Sons, Inc., 1953. Pp. xxvii,


These tables show to six decimal places the individual and cumulative
terms of the binomial distribution for probabilities from 0.01 to 0.50 in
steps of 0.01 (from which, of course, values from 0.50 to 1.00 in steps of
0.01 are readily obtained) and for sample sizes from 50 to 100 in steps of 5.
“The introduction defines the binomial distribution, discusses its relation to
the Hypergeometric and Incomplete Beta-function, explains the notation
used in the tables, describes the procedures used in computing the tables
and their accuracy, and gives directions for using the tables and for
interpolating into them, together with examples.”

The Government Printing Office has recently printed a far more extensive
table of the binomial distribution (giving, however, only cumulative
probabilities) with entries to seven decimals for the same probabilities covered in
Romig's table and for sample sizes from 1 to 150 inclusive, by steps of 1.
Apparently this table (Ordnance Pamphlet ORDP 20-1, Tables of the
Cumulative Binomial Probabilities, September 1952) is to be made available to the
public, in which case a more definite notice will be included in this Review


W.A.W.


Confidence Limits Tables for Samples of Binomially Distributed Data. John
Folger (Chief, Technical Services Division, Human Resources Institute,
Maxwell Air Force Base, Alabama). Maxwell Air Force Base, Alabama: Human
Resources Institute, May 1953. Pp. 12.

The tables give 95 per cent confidence intervals for sample sizes from 5
through 49 by steps of 1, and for all possible numbers of successes. “These




202 AMERICAN STATISTICAL ASSOCIATION JOURNAL, MARCH 1954 


confidence limits tables were prepared from Tables of Binomial Probability
Distribution, National Bureau of Standards, Applied Mathematics Series
6.” No exact definition of the confidence intervals is given, nor any further
account of the method of computing. Presumably, however, the intervals
are such that not less than 2.5 per cent lies in each tail.
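
Since the pamphlet gives no definition, any reconstruction of its intervals rests on a guess. The sketch below assumes one common construction, equal-tailed exact limits found by solving the two binomial tail equations numerically; it is not confirmed to be the method used for these tables, and all function names are illustrative.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X distributed as Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(f, target, increasing, tol=1e-10):
    """Solve f(p) = target on [0, 1] by bisection for a monotone f."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_limits(k, n, tail=0.025):
    """Equal-tailed limits for k successes in n trials (assumed construction)."""
    # Lower limit: the p at which P(X >= k) equals the tail area.
    lower = 0.0 if k == 0 else _bisect(
        lambda p: 1 - binom_cdf(k - 1, n, p), tail, increasing=True)
    # Upper limit: the p at which P(X <= k) equals the tail area.
    upper = 1.0 if k == n else _bisect(
        lambda p: binom_cdf(k, n, p), tail, increasing=False)
    return lower, upper


if __name__ == "__main__":
    # Sample sizes 5 through 49 are the range covered by the pamphlet.
    print(exact_limits(10, 49))
```

Under this assumption the area beyond each limit is exactly the nominal tail area at the limit itself; other conventions would give slightly different limits.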

W.A.W. 




Cambridge Elementary Statistical Tables. D. V. Lindley and J. C. P. Miller. 
Cambridge (England): Cambridge University Press, 1953. Pp. 35. $1.00. Paper
bound.


As stated in the Preface, “This set of tables is concerned only with the
commoner and more familiar and elementary of the many statistical
functions and tests of significance now available.”
Table 1 shows cumulative normal probabilities to 5 decimals for arguments
0(0.01)3.0(0.1)4 and for all arguments above 3.731, and a brief tabulation
of the normal frequency function. Table 2 gives the one-tail percentage
points of the normal distribution function for selected percentages. Tables
3, 5, and 7 give percentage points of the t-, χ²-, and F-distributions.
Table 4 gives the normalizing transformation for correlation coefficients.
Table 6 gives a means of estimating the standard deviation of a normal
population from the range of a small sample (13 or less). Table 8 gives 4,000
random digits. Table 9 gives the square, square root, reciprocal, reciprocal
square root, and common logarithm and antilogarithm of each integer to
1000; it also gives inverse circular and hyperbolic root-sine transformations.
Table 10 gives logarithms of factorials to the base 10.
It is the hope of the authors that “the values provided will meet the
majority of the needs of many users of statistical methods in scientific research,
technology, and industry in a compact and handy form,” and that they will
be convenient for the teaching and study of statistics in schools and
universities.
M.A.L.




County and City Data Book, 1952. A Statistical Abstract Supplement. Prepared 
under the direction of Morris B. Ullman (Chief, Statistical Reports Section, 
Bureau of the Census). Washington: United States Government Printing Office,
1953. Pp. xxx, 608. $4.25.


According to the Introduction, “This volume is one of a series of
supplements to the Statistical Abstract of the United States, and is designed to
meet the need for summary statistics for small geographic areas. Compactly
assembled in this volume are 128 items of data for each county, standard
metropolitan area, State, and geographic division; and 133 items of data
for each of 484 cities having 25,000 or more inhabitants in 1950. Also
included is a table showing the number of inhabitants of all urban places




(mostly incorporated places of 2,500 inhabitants or more) in 1950... . 
The year, 1952, used to designate this edition denotes the year during which 
compilation of the statistics occurred.” The title page notes “Statistics
included: For 1950, Agriculture, Area and Population, Banking, City
Government Finances and Employment, Construction, Education, Family Income,
Housing, Labor Force, Vital Statistics, and other subjects; for 1947 and
1950, Manufactures; for 1948, Trade and Services; and Climate.”

Ү.А. 


Bibliographie sur la méthode statistique et ses applications. G. Darmois and
E. Morice, editors. Paris: Institut International de Statistique and Institut
National de la Statistique et des Etudes Economiques, 1952. Pp. 49. Paper bound.


This bibliography lists 75 works dealing with statistical method and its
applications which have been written in or, in two instances, translated
into French, the majority within the last twenty years. The introduction
apologizes to authors whose works may not have been cited, explaining that
“the bibliography could not be exhaustive.”
The bibliography has been divided into two sections: (1) General Methods,
and (2) Applications. Under the first heading are (a) elementary works,
(b) intermediate works, (c) advanced works on theory, (d) elementary
probability, and (e) probability theory. The second section includes: (f)
economics and insurance, (g) industry and agriculture, (h) demography,
(i) medicine, biology, and psychology, and (j) mechanics and astronomy.
Besides the usual bibliographical details of author, publisher, date of
publication, number of pages, and price, this bibliography gives the table of
contents of each work, with a summary of what is included, in [word illegible].
A.L.




PUBLICATIONS RECEIVED 


Allen, R. G. D., and Ely, Edward, Eds. 
International Trade Statistics. New York: 
John Wiley and Sons, 1953. $7.50. 

American Society for Quality Control. 
Definitions and Symbols for Control Charts. 
New York: ASQC (70 East 45th St.), 1953. 
Paper. 50 cents. 

Anderson, Oscar Edward, Jr. Refrigera- 
tion in America. Published by the Princeton 
University Press for the University of Cin- 
cinnati, 1953. $6.00. 

Bross, Irwin D. J. Design for Decision. 
New York: The Macmillan Co., 1953. 
$4.25.

Bureau of the Census. Farms and Farm
People. Washington, D. C.: U. S. Govern-
ment Printing Office, 1952. Paper. 50 cents.

Burns, Arthur F. Business Cycle Re- 
search and the Needs of Our Times. 33rd 
Annual Report. New York: National 
Bureau of Economic Research, 1953. Paper.

Canada, Department of Trade and Com- 
merce. Private and Public Investment in 
Canada; Outlook 1953. Ottawa: 1953. Paper. 

-----. Supply of Building Materials in
Canada; Outlook 1953. Ottawa: 1953. 
Paper. 


Charnes, A., Cooper, W. W., and Hen- 
derson, A. An Introduction to Linear Pro-
gramming. New York: John Wiley and
Sons, 1953. Paper. 


in Cincinnati and 
Hamilton County, Ohio. 1950-1952. Cincin- 
nati: Council of Social Agencies, 1952. 
Spiralbound.

Committee for Economic Development.
Flexible Monetary Policy; What it is and 
how it works. New York: March 1953. 
Paper. 

Connolly, T. G., and Sluckin, W. Statistics
for the Social Sciences. New York: Hafner
Publishing Co., 1953. $2.75.

England and Wales, The Registrar General's
Statistical Review of, for the Year 1951.
London: H.M. Stationery Office, 1953.
Paper. 5 s.

Ferber, Robert. The Railroad Shippers’
Forecasts. Urbana: University of Illinois,
1953. Paper. $1.00.

Festinger, Leon, and Katz, Daniel, eds.
Research Methods in Behavioral Sciences.
New York: The Dryden Press, 1953. $5.90.

Gilbert, Milton, ed. Income and Wealth,
Series III. Cambridge, England: Bowes 
and Bowes, 1953. 35 s. 

Hansen, Alvin H., and Clemence,
Richard V., eds. Readings in Business Cycles
and National Income. New York: W. W.
Norton and Co., 1953. $5.25.

Hansen, Morris H., Hurwitz, William N., 
and Madow, William G. Sample Survey 
Methods and Theory. New York: John Wiley 
and Sons, 1953. Vol. I, $8.00; Vol. II, $7.00. 

Hertz, David B., ed., and Rubenstein, 
Albert H., ass't ed. Research Operations in 
Industry. New York: Columbia University
King's Crown Press, 1953. $8.50.

Hickman, W. Braddock. The Volume of 
Corporate Bond Financing, since 1900. 
Princeton: Princeton University Press, 
1953. $7.50. 

Hilton, P. J. An Introduction to Homoto- 
py Theory. Cambridge, England: Cam- 
bridge University Press, 1953. Paper 15 s. 

Katona, George, and Mueller, Eva. Con-
sumer Attitudes and Demand, 1950-52. Ann
Arbor: University of Michigan, 1953. 
Paper, $1.50; cloth, $2.00. 

Kenya Colony and Protectorate, Report 
on the Census of the Non-Native Population
of, Taken on the Night of the 25th February 
1948. Nairobi: Government Printer, 1953. 
Paper. 12/50. 

Kuhn, H. W., and Tucker, A. W., eds. 
Contributions to the Theory of Games, Vol. 
II. Princeton: Princeton University Press,
1953. Paper. $4.00. 

Kyrk, Hazel. The Family in the Ameri- 
can Economy. Chicago: University of Chi- 
cago Press, 1953. $6.00. 

Longley-Cook, L. H., and Hooker, P. F. 
Life and Other Contingencies, Vol. I. New
York: Cambridge University Press, 1953.


MacNiece, E. H. Industrial Specifications.
New York: John Wiley and Sons,
1953. $4.50.

Merriam, Ida C. Social Security Financing.
Bureau Report 17, Social Security
Administration. Washington, D. C.: U. S.
Government Printing Office, 1952. Paper.
$[illegible].00.

Minnesota, University of, Industrial
Relations Center. Compensation Principles
and Practices. Research and Technical
Report 13. Dubuque, Iowa: Wm. C. Brown
Co., 1953.


Minnesota, Social Science Research
Center of the Graduate School. The Garrison
State, its human problems. Minneapolis:
University of Minnesota, 1953. Paper.

National Bureau of Standards. Hypergeometric
and Legendre Functions With Applications
to Integral Equations of Potential
Theory. Applied Mathematics Series 19.
Washington, D. C.: U. S. Government
Printing Office, 1952. $3.25.

-----. Simultaneous Linear Equations
and the Determination of Eigenvalues. Applied
Mathematics Series 29. (Edited by
L. J. Paige and Olga Taussky.) Washington,
D. C.: U. S. Government Printing
Office, 1953. $1.50.

-----. Table of Natural Logarithms for
Arguments Between Zero and Five to Sixteen
Decimal Places. Applied Mathematics
Series 31 (reissue of Math. Table 10.)
Washington, D. C.: U. S. Government
Printing Office, 1953. $3.25.

-----. Tables of Bessel-Clifford Functions
of Orders Zero and One. Applied Mathematics
Series 28. Washington, D. C.: U. S.
Government Printing Office, 1953. Paper.
45 cents.

-----. Tables of Chebyshev Polynomials
Sn(x) and Cn(x). Washington, D. C.: U. S.
Government Printing Office, 1952. $1.75.

Northcott, D. G. Ideal Theory. London:
Cambridge University Press, 1953. Paper.

Organisation for European Economic
Co-operation. National Accounts Studies; Norway.
Paris: OEEC (Columbia University
Press official distribution agent in U.S.A.),
June 1953. Paper. $2.00.

Quebec, Statistical Yearbook [years illegible].
Department of Trade and Commerce
[remainder illegible].

Railroad Retirement Legislation, Report
of the Joint Committee on. Retirement
Policies and the Railroad Retirement System.
Part 1; Issues in Railroad Retirement:
Part 2; Economic Problems of an Aging
Population. Washington, D. C.: U. S.
Government Printing Office, 1953. Paper.

Schultz, Theodore W. The Economic
Organization of Agriculture. New York:
McGraw-Hill Book Co., 1953. $8.50.

Schweizerische Gesellschaft für Statistik
und Volkswirtschaft. Schweizerische
Bibliographie für Statistik und Volkswirtschaft,
1950/52. Zurich: 1953. Paper.

Tintner, Gerhard. Mathematics and Statistics
for Economists. New York: Rinehart
and Co., 1953. $6.50.

United Nations, Statistical Office of.
Commodity Indexes for the Standard International
Trade Classification; Preliminary
Issue. Statistical Papers Series M, No. 10.
New York: Columbia University Press official
distribution agent in the U.S.A., April
1953. Paper. $5.00.

-----. Statistical Yearbook, 1952. New
York: 1952. Paper, $6.00; cloth, $7.50.

-----. Yearbook of International Trade
Statistics, 1952. New York: Columbia University
Press official distribution agent in
the U.S.A., 1953. Paper. $4.00.

United Nations, Technical Assistance
Programme. The National Income of the
Philippines and its Distribution. New York:
Columbia University Press official distribution
agent in the U.S.A., 1952. Paper. 40
cents.

Virginia Polytechnic Institute, Bulletin
of the, Vol. XLVI, No. 4. The Department of
Statistics and the Statistical Laboratory.
Blacksburg, Virginia: February 1953.
Paper.

Walker, Helen M., and Lev, Joseph.
[title and imprint illegible].

World Health Organisation. Annual Epidemiological
and Vital Statistics [remainder of
title illegible]. Geneva: [year illegible].
Paper. $5.00.

-----. Part II. Cases of and Deaths
from Notifiable Diseases, 1947-49. Geneva,
1953. $75.

Woytinsky, W. S., and Woytinsky, E. S.
World Population and Production: Trends
and Outlook. New York: The Twentieth
Century Fund, 1953. $12.

Yokohama Mathematical Journal, Vol.
1, No. 1. Yokohama, Japan: May 1953.
Paper.



Correction for publication listed in December, 1953 (vol. 48, no. 264):
Hood, William C. and Koopmans, Tjalling C., eds. Studies in Econometric Method. 
Cowles Commission for Research in Economics, 1953. 



RANDOM DIGITS (12,876-15,125) 


From A Million Random Digits, to be published by the Rand Corporation, Santa Monica, California.


Digits 8126-12,875 (erroneously labeled digits 9001-13,750) were published in
this Journal, Vol. 48, p. 931 (December 1953).


59748 72905 18532 10721 22029 
06364 17893 86689 39755 39547 
89772 55293 52849 31052 79655
24405 15409 33298 87632 61849 
09657 36850 81569 83651 85795 
48524 47778 80692 85476 23790 
00740 36666 02680 21904 18370 
11250 18463 86989 12625 30635
60923 68685 94994 87175 88318 
40168 33501 59817 58830 00157
33842 89565 53359 13062 97614
97035 57905 09581 25343 17033 
56429 73216 12342 14486 76624 
63242 96 56690 42093 35660 
65814 21118 22140 36636 18291 
02370 42100 73370 11944 85727
68037 41963 03874 44856 49762 
00405 62369 55080 61880 26027 
00234 14705 93418 94084 16525
91656 98079 52384 43306 31948
33557 87793 90857 10143 46726 
91408 80220 05728 68890 46577
50106 10099 13722 19572 44004
57782 63951 53723 86853 63851
76162 71724 40028 94786 34457 

95270 96584 81907 04055 53990
53445 67097 95523 66568 63632
77385 29911 65690 41178 47712
23854 34784 70950 54680 578H
72119 45668 03459 29870 78252 
85349 93335 ? 86853 15860 

71048 80847 75608 39646 90871 
70255 $1083 58581 44364 57468 
53782 55926 64013 63562 41388 
70088 12237 47838 46712 39848
81167 91259 14721 41014 97025
70284 82100 11669 02629 49845
21539 13042 76431 78515 02624
32397 74152 54102 80832 59979
35083 65927 95061 16625 77086




08511 


39746 


75249 
53163 
27236 
57532 


20883 


92132 
03409 
29678 
91889 
90927 


60819 


67431 
09285 


73889 
93877 
59141 
40998 
20279 


49717 
69825 
82990 
19929 
80825 


12654 
71670 
20802 
14097 
57951 


02606


82021 
88514 
66840 
52989 


75579 
33354 
53230 
06558
54404 


73719 


02446 


57886 
14892 
76489 

57694
04658




JOURNAL OF THE AMERICAN 
STATISTICAL ASSOCIATION 


Number 266 JUNE 1954 Volume 49


MEASUREMENT FOR ECONOMIC MODELS* 


STANLEY LEBERGOTT 
Bureau of the Budget


Three chief subjects are considered. (1) How accurate 
should data be for economic analysis via economic models, 
particularly input-output models? Preference is expressed for 
measurement of additional aspects of economic phenomena 
rather than broadside improvements in generally sound series. 
(2) How can models be developed to utilize imperfect data? 
Several suggestions are made: (a) examine the concepts and 
methods lying behind the basic data; (b) question models 
acutely sensitive to the inclusion of a single observation; (c)
evaluate the economic meaning of models after empirical
testing; (d) set up check models. (3) How can more adequate
data be developed? A proposal is outlined for an integrated
set of data on financial aspects of business, and for similar
integration of data on employment and on consumer
economic behavior.


More than half a century has passed since Marshall first sketched
a system of equations which could be developed “until they
embraced within themselves the whole of the demand side of the problem
of distribution.”1 In that period the development of systems of
equations, or models, has become a commonplace in economic study. Have
our measurements kept pace with the demands which such systems
place upon them? Three central questions can be set out for
consideration:


1. How accurate should economic data be for model building?
2. How can models be developed to utilize imperfect data?
3. How can more adequate data be secured?


* A paper presented at the annual meeting of the Econometric Society, December, 1952. The
opinions expressed are not necessarily those of the Bureau of the Budget.
1 Principles (London, Macmillan, 1916), Mathematical Appendix, Note xiv.


210 AMERICAN STATISTICAL ASSOCIATION JOURNAL, JUNE 1954 
I 


How accurate should economic statistics be for model building?
Clearly we do not wish them to be as accurate as possible: no one is
prepared to devote the time and money required to achieve this end.
We could, for example, immediately improve our estimates for gross
national product, gross private domestic investment, new construction,
personal consumption expenditures, incomes of unincorporated
enterprise, personal savings, net realized farm income, index of prices paid
by farmers, parity income index, and a host of other series if we had
substantially better data on how much farmers spend for hammers and


data on food consumption (Department of Agriculture estimates of
food disappearance, the food component of the gross national product
series, and the Census of Business data on food sales at retail) if we
had a comprehensive study of food consumption outside private
households. These projects, however, certainly lack the glamor and
usefulness of many other projects; and it is most unlikely that funds


curate statistics should be. Nearly half a century ago, for example,
Alexander Dana Noyes analyzed a period of recession, using statistics
which reported a 50% decline in iron output, a 25% fall in textile
output, a 55% rise in commercial failures and a 12% drop in railway



nificant difference in the ability with which businessmen or government
officials can evaluate economic trends?

Is the issue any clearer for model builders? Professor Morgenstern
has pointed out as “obvious” that in large scale numerical operations
such as those involved in inter-industry analysis “only very high
quality data” should be used, given the extent of the numerical
operations and their considerable cost.4

In actual practice how should such a prescription be applied? One
of the dubious rows in the comprehensive and invaluable 190 by 190
inter-industry matrix which the Bureau of Labor Statistics recently
prepared is that which distributed the output of “other repair services.”
The row for retail trade is only somewhat less dubious. Statisticians in
the BLS and elsewhere have clearly recognized the inadequacies of
these data and insisted on the desirability of improving them. Yet in
the actual use of the model can we expect different policies to be
pursued, or even suggested, if the aggregate demand for “other repair
services” or for trade services differs by 1%, by 10%, by 20%?
Probably not. On the other hand variations of 10% in the projections
for copper demand might well be significant. Proceeding further, does
this mean that all the coefficients indicating demands for copper by
particular industries should be improved? Not necessarily. Of an
estimated $1.3 billions in gross domestic output of copper in 1947, only
some $200,000 was used by the industrial inorganic chemical industry.
Clearly it is more important in terms of the final use of the matrix to
improve not the coefficient for copper use by this industry but rather
for use by the insulating wire industry, which consumed $345 millions,
or even better, by the new construction industry which, on the crudest
kind of data, was estimated to consume $162 millions.

Let us look at this from another viewpoint. As the size of the chart
grows the size of the average coefficient diminishes, as does the
likelihood that the error in estimating any given coefficient will significantly
distort the estimates of output which are derived from models using
the matrix. Hence the case for broadside improvements of data
decreases in potency.5
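
The point about small coefficients can be illustrated numerically. The following is a minimal sketch under stated assumptions: a hypothetical three-industry open model with outputs x solving x = Ax + d, an invented coefficient matrix and final demand, and the same 20% relative error applied in turn to a very small and to a large coefficient. None of the figures come from the BLS matrix.

```python
def solve(M, b):
    """Solve M x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(n):
            if r != c:
                f = aug[r][c] / aug[c][c]
                aug[r] = [a - f * p for a, p in zip(aug[r], aug[c])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def gross_output(A, d):
    """Outputs x satisfying x = A x + d, i.e. (I - A) x = d."""
    n = len(d)
    i_minus_a = [[(1.0 if i == j else 0.0) - A[i][j] for j in range(n)]
                 for i in range(n)]
    return solve(i_minus_a, d)


def max_relative_change(A, d, i, j, rel_err):
    """Largest relative change in any output when A[i][j] is off by rel_err."""
    base = gross_output(A, d)
    perturbed = [row[:] for row in A]
    perturbed[i][j] *= 1 + rel_err
    new = gross_output(perturbed, d)
    return max(abs(n_ - b_) / b_ for n_, b_ in zip(new, base))


# Hypothetical coefficients: A[0][2] is tiny, A[0][1] is large.
A = [[0.10, 0.30, 0.002],
     [0.20, 0.05, 0.10],
     [0.05, 0.10, 0.15]]
d = [100.0, 50.0, 80.0]  # hypothetical final demand

small = max_relative_change(A, d, 0, 2, 0.20)
large = max_relative_change(A, d, 0, 1, 0.20)

if __name__ == "__main__":
    print("20% error in the small coefficient:", small)
    print("20% error in the large coefficient:", large)
```

In this invented example the same proportional error matters far less in the small coefficient, which is the sense in which across-the-board improvement of every cell loses its force as the matrix grows.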

It is particularly unfortunate that so many important studies of the


4 Oskar Morgenstern, On the Accuracy of Economic Observations (Princeton, Princeton University
Press, 1950), p. 39.

5 Offsetting errors within a closed system will, of course, further minimize errors of estimate.
However, there is no clear assurance that they will offset within the area of particular interest (e.g.,
demand for a particular industry's products), and a fortiori this argument does not apply in the usual
open system.



stability of input-output coefficients, of the accuracy with which
projections are made by models using input-output matrices, should have
emphasized average errors or across-the-board accuracy.6 Let us
concede that it is useful to possess such overall measures. At the same time
we must note that these measures are unrealistic: unrealistically
harsh on input-output models as well as unrealistically easy on them.
They are overly harsh because a complete improvement of all
coefficients is certainly not to be achieved: it would cost as much as
dredging a yacht basin or similar projects to which a distinguished
senatorial economist has called attention.

Yet at the same time they are too generous insofar as they permit the
implication that errors in estimating the total output of steel are no
more forbidding than equal errors in estimating the total output, say,
of trade. Clearly the difference is critical if such projections are not
designed as mere mathematical exercises but intended for potential
policy use.
Harold Barnett’s valuable study makes it possible to emphasize this
distinction.7 From his data we can contrast how the errors made in
projecting the actual 1950 indexes of output for all industry groups
compare with those for subtotals of separate industry groups. Let us
set up two categories: one to include industries for which input-output
projections would be of lesser interest since they would probably not be
limiting elements in a full employment or mobilization situation,
e.g. agriculture, food processing, etc.8 The second group would include
chemicals, metals, transport, and the more limiting industries. In the
consumption model the average errors in projecting 1950 indices by
the 1939 input-output matrix were as follows:

All industries: 31.5 index points

“Less limiting” industries: 23 index points

“More limiting” industries: 36 index points


In the investment model the differences were even more marked; 
errors for the more critical industries were 41, or almost double the 21 
for the less critical industries.


6 Cf. inter alia Selma Arrow, Comparisons of Input-Output and Alternative Projections, 1929-39
(The [publisher illegible], 1951).

W. W. Leontief, The Structure of the American Economy, 1919-39 (New York, Oxford University
Press, 1951), pp. 216-18. These figures have been superseded by a comparison for 1950 outputs using
the 1939 matrix in an unpublished memorandum by Professor Leontief and J. Fei. To some extent
Waugh's analysis as applied by Christ is an exception. Cf. Part III of Carl F. Christ, “A Review of
Input-Output Analysis,” in Conference on Business Cycles (New York, National Bureau of Economic
Research [remainder illegible]).

7 Harold Barnett, Specific Industry Output Projections (Rand [citation details illegible])
gives deviation in index points from actual 1950 [remainder illegible].

8 Agriculture, food processing, lumber, furniture, wood and paper, printing, textile, apparel and
[remainder illegible].



The inference I would draw from all this with respect to the
improvement of data for input-output and similar models is that we
attempt increasingly to separate the sheep from the goats: industries
with reasonably well behaved coefficients where erroneous projections of
their output would involve small losses should be estimated as
accurately as possible once, and thenceforth given second place in our
attention relative to a very limited number of industries (such as steel
foundries, primary copper, etc.) and the vast area of guessing the
exogenous variables. To some fair extent this is what has been done in
empirical research on input-output coefficients. We clearly do not
require the mammoth sample survey of all manufacturing industries
which has been proposed in the past in order to produce coefficients for
all industries of “a very high quality.”

All of which still does not solve our initial question. We may agree 
that neither improving the accuracy of all economic data nor achieving 
maximum accuracy for most data is a practical aim. But how accurate 
should any such data be? The question, I think, can only be solved in 
ambulando. There is no easy way of calculating the probable loss which 
results from errors in estimating economic data of the kind used by 
most econometricians. This is true because policy decisions do not 
usually rest on a single datum, a single series nor a single model. And 
it is even truer because errors in the choice of a model can have a far 
more critical impact than errors in the data.9 It is up to the data
producers, model builders, and data users generally to express their
judgments on what levels of error are so intolerable that they believe more
money should be spent for improving the data. 

I should like to express a general preference for additional
measurement series rather than improvements in existing series or data. Let
us grant the obvious exceptions and assume, moreover, that the data
are collected by persons with at least a minimum budget and minimum
statistical competence. Then I think it can safely be said that
knowledge about additional aspects of our economy will generally make a
greater contribution to sound theory and sound policy than will
improvements in the accuracy of existing series.

As one instance let me cite the patient and ingenious labors which
produced Fabricant’s index of manufacturing output.10 The increases
in accurate detail have been invaluable. Yet so far as concerns the
totals for industrial production—and it is these which are used so

9 The war [words illegible], the choice of consumption functions,
[words illegible] plans, business reaction to tax
schedules and so on. Errors in the data were of small moment.
10 Solomon Fabricant, The Output of Manufacturing Industries, 1899-1937 (New York, National
Bureau of Economic Research, 1940).



widely in economic models and policy decisions—the resultant index 
differs little from the one which Day and Thomas developed from far 
worse data with far less work.11 Furthermore, neither index for total
manufacturing would differ greatly in movement over the years from a 
very crude combination of output indices for textiles and iron and steel. 

A similar consideration is apparent when one compares the recently
revised and unrevised FRB production indexes, wholesale price
indexes, etc.12

The substantial differences in series occur when a responsible
investigator with a reasonable minimum of resources, cooperation and
time first develops a series on a sound conceptual basis. Further
improvements may be real; they certainly are hard won and usually
expensive. They usually profit the model builder very little.

In pressing for improved measurement in economic statistics the 
model builder would be well advised to give increased attention to the 
basic opportunity cost question. That question is this: do we improve 
our understanding of a particular economic phenomenon more by im- 
proving still further a particular series (or datum) or by adding new 
series on yet uncharted aspects of the phenomenon? 


II 


Given the present existence, and probable persistence, of errors in
economic observation what is the model builder to do? How is he to
build models for utilizing imperfect data? Models allowing for a
random run of disturbances have been developed, indeed well developed, in
recent years and there is hardly need to recommend their use to
econometricians. But if we are concerned with biased data, where acute errors
are possible (and those not randomly disturbed), some alternative
action is required.

A. The first requisite for the model builder is to examine the con- 
cepts and broad methods of estimation used in deriving the data which 
he incorporates into his model. I realize that this may be considered 
an undue burden; developing a sound model is a more than sufficient 
labor. Yet the alternative is even more unsatisfactory.


11 E. E. Day and W. Thomas, The Growth of Manufactures, 1899 to 1923 (Washington, Government
Printing Office, 1928), pp. 34, 94. With 1939 = 100, the 1899 estimates are 28 for Fabricant and (32) for
Day and Thomas. Succeeding estimates are 34(39), 43(51), 51(55), 61(69), 54(54), 77(85), 82(88), and

12 The examples may be new, but the observation is an old one. Cf. Wesley Mitchell, History of
Prices During the War (Washington, Government Printing Office, 1919), p. 28, where he indicates the

13 None of this, of course, bears on the question of what topics happen to evoke the abilities of
research workers: if first class statisticians are willing to toil for years in the statistical salt mines, theirs


MEASUREMENT FOR ECONOMIC MODELS 215 


Let us consider, for example, the ill charted and forbidding area of
consumption projections. Klein is outspoken in asserting that “to
many of us engaged in econometric work it became obvious in the
second half of 1947 that the most serious deficiencies in the existing
models lay in the consumption equation and in the group of relations
serving to determine absolute prices.”14 Despite the wealth of data
available on the subject, the relations between consumption and
income continue to baffle projectors. On the one hand the budget surveys
uniformly report that the proportion of income saved is greater at upper
income levels than at lower ones. On the other hand Kuznets’ data on
capital formation seem to indicate that the aggregate consumption
function has not changed perceptibly in the past sixty years when
incomes have risen so.

In an acute discussion of the factors affecting consumption Professor
Fellner has emphasized that

The most obvious characteristic of the historical consumption function, as
calculated from Professor Kuznets’ estimates, is that it does not show a
tendency to flatten out. It tends to linearity regardless of the varying
population growth of the subsequent historical periods.15


A recent path breaking study on consumer behavior similarly notes
that “the Kuznets’ data do not show any trend in the savings ratio,”
suggesting that even if their level were incorrect they “allow us to
make a judgment about the movement of the savings ratio.”16

Reference to the basic sources indicates that, to a very great extent,
this constancy exists because it was estimated that way. (And,
incidentally, it was estimated that way because a sensible model indicated
that to be the best method of estimate.)

More specifically: the consumption function relates consumption to
net product. The latter consists of two segments: consumption and
investment. Obviously consumption correlates perfectly with itself. But
the investment segment also will reveal a high correlation with
consumption because of the method of estimate. Let us consider in turn
each of the components of the investment total.

a) Inventory change. Changes in manufacturing, trade and “all other”
inventories, together accounting for more than half the total inventory
change in most years, are estimated by constant ratios to output.17

14 Conference on Business Cycles (New York, National Bureau of Economic Research, 1951), p. 117.

15 William J. Fellner, Monetary Policies and Full Employment (Berkeley, University of California
Press, 1947), p. 56. Cf. also Paul Samuelson, in Seymour Harris, ed., Postwar Economic Problems (New
York, McGraw Hill, 1943), p. 33.

16 James Duesenberry, Income, Saving and the Theory of Consumer Behavior (Cambridge, Harvard
University Press, 1949), p. 56.

17 Simon Kuznets, National Product Since 1869 (New York, National Bureau of Economic
Research, 1946), pp. 109, 110.




Now since a heavy proportion of total output is output of consumer
goods, this method of estimate means that these inventories will
necessarily fluctuate with consumer expenditures. Changes in inventories of
livestock represent a further link, since livestock slaughter (a
component of consumer expenditures) was estimated, in the original
Department of Agriculture source, by applying ratios to inventories.18

b) Construction. The output of construction materials was used (with
a weight of one, plus semi-durables with a weight of two) to interpolate
estimates of consumer durable output between census dates.19 This is a
further link between the consumer and producer segment estimates.

c) Producer durables. The basic trends for producer, as for consumer
durables, derive from the 1869, 1879, and other Census date totals
estimated in William Shaw's comprehensive and fundamental study. For
most durable categories Shaw extrapolated 1914 and 1909 totals to
earlier years by using a constant ratio to split Census group totals
between consumer and producer durables. The procedure assumes, for
example, a constant ratio of consumer to producer goods in the output
of furniture, sewing machines, foundry and machine shop products,
etc. Thus 67.7% of all sewing machines were assumed to be household
machines in every Census year from 1869 through 1909, with the
balance classed as business use.22 Other constant ratios (under 100%) were
used for furniture, heating and cooking apparatus, appliances, cutlery,
etc. For an important class of items Shaw made the reasonable
assumption that none were bought by producers, all by consumers, at each of
the Census dates. These include “family and pleasure” carriages and
wagons, pianos, organs, rugs, glassware, books, etc.23

So far as concerns the level of commodity flow results, or even many
broad economic trends, these decisions raise no question: a responsible
estimator cannot arbitrarily vary ratios without some data to which to
tie. But if we are concerned with something as precise as the constancy
of the historical consumption function, we should decide how critical
was the decision to assume that the rising number of apparel firms
did not affect the proportion of all sewing machines bought by business,
or the assumption that the growing number of restaurants and hotels
did not change the proportion of all dishes and cutlery they bought,


18 F. Strauss and L. Bean, Gross Farm Income and Indices of Farm Production and Prices in the
United States, 1869-1937 (Washington, Government Printing Office, 1940), p. 106.
19 [...] p. 95.
20 William H. Shaw, Value of Commodity Output Since 1869 (New York, National Bureau of
Economic Research, 1947).
21 Ibid., Table III.
22 Shaw, op. cit., p. 161. For 1879-1903, 8.17% of the reported total is excluded as
23 Ibid., Table III and Note B to Table III.




etc. When we come to use the data in economic models, we must not 
try to explain any constancy of savings ratios in final series if it rests 
ultimately on the constancy of arithmetic ratios used in the estimating 
process. 
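
The point can be made concrete with a small numerical sketch. It assumes a deliberately extreme case, not the actual Kuznets or Shaw procedure: every dollar of investment is estimated as a fixed ratio to consumption. The series, the ratio of 0.15, and the function name are all hypothetical.

```python
import random


def estimate_by_constant_ratio(consumption, ratio):
    """Estimate investment as a fixed ratio to consumption (an assumed,
    deliberately extreme estimating rule) and return the implied series."""
    investment = [ratio * c for c in consumption]
    net_product = [c + i for c, i in zip(consumption, investment)]
    return investment, net_product


random.seed(0)

# Hypothetical consumption series: sixty "years" of irregular growth.
consumption = [100.0]
for _ in range(59):
    consumption.append(consumption[-1] * (1.0 + random.uniform(-0.05, 0.10)))

investment, net_product = estimate_by_constant_ratio(consumption, ratio=0.15)

# Share of net product not consumed, year by year.
savings_ratio = [i / y for i, y in zip(investment, net_product)]
spread = max(savings_ratio) - min(savings_ratio)

print(f"consumption grew by a factor of {consumption[-1] / consumption[0]:.1f}")
print(f"spread of the savings ratio: {spread:.1e}")
```

However irregularly the hypothetical consumption series moves, the savings ratio stays fixed at ratio / (1 + ratio), apart from rounding. In the actual estimates only some components rest on constant ratios, so the effect would be partial rather than complete.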

B. In addition to a careful examination of the methods of estimate a
second step is to question models which are acutely sensitive to the
addition or subtraction of a single observation, such as that for a
single year or a single industry.24

A recent model which appears to be in this category is Modigliani’s
savings model. Modigliani defines consumer saving as a function of
the current year’s income and highest previous year’s income as
measured by a cyclical income index. The model provides a beautiful fit for
the 1921-40 period. If, however, we simply add the values for 1941,
as provided in the appendix to Modigliani’s study, we derive an
estimated 1941 value so far in error that it cannot even be plotted on the
chart he shows.26

His particular equation for allowing for the influence of previous
years’ incomes essentially states this: that for the entire decade of the
1930’s the consumers continued to hark back to the halcyon 1929 level
of incomes in deciding how much money to spend. However, the same
model indicates that the consumer’s frame of reference (say, for
deciding on his 1937 saving) would swiftly move forward six years from
1929 to 1936 if the 1936 total were a mere 4% greater.27 (It would,
incidentally, take a courageous man to assert that Commerce’s 1936
income estimate is accurate to within 4%.)

A fairly simple method to keep from implying that such drastic
changes in consumer behavior occur when a single observation is
changed slightly or added to previous ones would be to drop what is
essentially a function based on ranking in favor of one based on a
direct measure of magnitude.28 The latter would be less at the mercy of
minor variations in source reporting or estimating techniques.
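
A minimal sketch of the contrast between a ranking rule and a magnitude rule follows. The income index is hypothetical (it is not Modigliani's or the Commerce series), and the average of earlier incomes is only an illustrative stand-in for "a direct measure of magnitude"; the text does not specify which measure should be used.

```python
def reference_year(incomes, year):
    """Ranking rule: the earlier year with the highest income."""
    prior = {y: v for y, v in incomes.items() if y < year}
    return max(prior, key=prior.get)


def mean_prior_income(incomes, year):
    """Magnitude rule (an illustrative stand-in): average of earlier incomes."""
    prior = [v for y, v in incomes.items() if y < year]
    return sum(prior) / len(prior)


# Hypothetical income index (1929 = 100); not Modigliani's or Commerce's data.
incomes = {1929: 100.0, 1930: 90.0, 1931: 80.0, 1932: 70.0,
           1933: 72.0, 1934: 80.0, 1935: 88.0, 1936: 98.0}

# The same series with the 1936 figure revised upward by 4 per cent.
revised = {**incomes, 1936: incomes[1936] * 1.04}

year_before = reference_year(incomes, 1937)
year_after = reference_year(revised, 1937)
shift = mean_prior_income(revised, 1937) / mean_prior_income(incomes, 1937) - 1.0

print(year_before, year_after)
print(f"change in the magnitude measure: {shift:.2%}")
```

Under these assumed figures a 4 per cent revision to one year moves the ranking rule's frame of reference from 1929 to 1936, while the magnitude measure changes by well under 1 per cent.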

24 Carl Christ [...] One
might add that economic relationships generally change so slowly that we should question both the
data and the model before we conclude that a sharp change in relationship has, in fact,

25 Franco Modigliani, “Fluctuations in the Saving-Income Ratio: A Problem in Economic
Forecasting,” in Studies in Income and Wealth (New York, National Bureau of Economic Research, 1949),
Vol. XI.

26 Franco Modigliani, op. cit., p. 380.

27 The only consolation indicated is that since 1941 income levels were above any prior year 1920-
40 the gain is defined as being a secular, and not a cyclical one. Consider the economic meaning of this.
If per capita income rises $100, this is $100 of cyclic increase; if it rises $101, none of it is cyclical.

28 A distributed lag approach was wisely suggested by Leontief; Modigliani noted that he had tried
this and secured worse results. But less precise correlations based on due recognition of data
shortcomings are to be preferred to the converse.




limited information version of the important Klein model III is almost
equally delicate. By adding two observations to the 21 year run from
1921 through 1941 (which had included short recessions, deep
depression, and continued prosperity) Christ obtains a projected 1948
demand for labor which Klein reservedly calls “fantastic,” a comment
echoed by Christ.29 This, moreover, is no trivial or paltry incidental
equation: it is fundamental to the model. It seems a striking fact that
the disturbances which were trivial and random in Klein’s limited
information estimates become substantial and nonrandom in Christ’s
versions: they are negative for 1924, 1927, 1930, and other recession
years, but positive (with one exception) for all other years.30

Christ assumes that the problem lies in the limited information
method at least as much as with the equation. There is merit in this
argument, but not enough. While the limited information solution
actually blows up, the standard error of disturbances even for the
least squares solution is sufficiently disturbing, rising from .96 to 1.47
when Christ adds these few years to the original run. Klein favors errors
in the basic data, particularly pointing to inadequacies in the BLS
consumers price index. While the price data are dubious this is only
part of the answer.31 It would appear that the chief difficulty lies in the
fact that Christ’s calculations imply only a $9 billion rise in consumer
expenditures from 1946 to 1947, whereas the Commerce data
(published after his original work) report a $17 billion rise: reason enough
to show the inconsistency between output and employment data which
he and Klein discuss.32 What is more important than the particular
reason is the fact that any system which is so sensitive to the addition
of observations would seem to fall into the “handle with care”
class.
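
One way to apply this test in practice is to refit the relation with and without the added observations and compare the results. The sketch below is an assumption-laden illustration: it uses a single-equation least squares line rather than the limited information method, and both the 21-observation run and the two added points are hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, se) where se
    is the standard error of the residuals."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    se = math.sqrt(sum(r * r for r in residuals) / (n - 2))
    return a, b, se


def sensitivity(x, y, extra_x, extra_y):
    """Compare the fit on the original run with the fit on the extended run."""
    _, b0, se0 = fit_line(x, y)
    _, b1, se1 = fit_line(x + extra_x, y + extra_y)
    return {"slope_change": b1 - b0, "se_ratio": se1 / se0}


# Hypothetical 21-observation run lying close to a straight line.
x = [float(t) for t in range(1, 22)]
y = [2.0 + 0.5 * t + (0.1 if int(t) % 2 else -0.1) for t in x]

# Two added observations that depart from the earlier relationship.
report = sensitivity(x, y, extra_x=[22.0, 23.0], extra_y=[8.0, 8.5])
print(report)
```

For comparison, the rise from .96 to 1.47 cited above corresponds to a ratio of about 1.5. How large a ratio should place a model in the “handle with care” class is a matter of judgment that the sketch does not settle.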

C. There is a third consideration for model builders to bear in mind 
when using data—particularly imperfect data. And that is the need to 
evaluate the economic meaning of econometric models after empirical 
testing. Just as we will accept a sampling procedure which gives 
biased results provided the total errors are considered satisfactory so 




29 Carl Christ, “A Test of an Econometric Model for the United States, 1921-1947,” in Conference
on Business Cycles (New York, National Bureau of Economic Research, 1951), p. 124.

30 Ibid., p. 105.

31 From the [...] and BAE sources which Christ notes (ibid., p. 91) one can estimate his price rise
from 1946 to 1947 at 16%. This may be compared with the 10% shown in the
1951 National Income Supplement, p. 146.

32 Taking Christ's consumption data from his study (ibid., p. 90) and applying the 16% rise
computed above indicates a rise in consumption in current prices of $9 billions, as compared with a rise of



we should prefer a model with greater resemblance to the economic
world instead of one that has somewhat less error and distinctly less
meaning.

It is instructive on this point to consider the recent study in which
Tobin sets up a sensible model of the food market, combining budget
data and time series data with equal parts of care, agility and aplomb.33

Tobin posits the following model of the retail food market:34


$$S_t = K\,Y_t^{\alpha}\,Y_{t-1}^{\beta}\,P_t^{\gamma}\,Q_t^{\delta}$$


where $S_t$ is per capita food supply for domestic consumption, $Y_t$ is
disposable income, $P_t$ represents food prices and $Q_t$ all other prices.

Converting to a reduced form and rewriting, he developed his
system as follows:


$$\log P_t = b_0 + b_1(\log S_t - \alpha \log Y_t) + b_2(\log Y_t - \log Y_{t-1}) + b_3 \log Q_t.$$


On a subsequent page, Tobin computes values for the parameters using 
various Agriculture and Commerce Department series for 1913-41. 
The values for each “b” are negative. The economic meaning of Tobin’s 
model then seems to be this: When disposable income rises food prices 
tend to fall; and when food consumption rises more than past income- 
elasticity relationships would have predicted there will likewise be a 
tendency for food prices to decline.

This fairly odd pair of inferences suggests, I think, the desirability
of evaluating the economic meaning of the model and its empirical
parameters in some detail. The fact that the food consumption series is
far less sensitive than the series for income or prices may offer an
explanation. Its range is a mere 10% over the period Tobin uses it,
whereas food prices and disposable income each had a 100% range.35
This lack of sensitivity of the food consumption index suggests that
some alternative measure of food consumption might properly be used,
e.g. the deflated food expenditures figures of the Department of
Commerce. A detailed comparison between the BAE and the
Commerce indices (even after adjusting for such conceptual differences as
the Commerce inclusion of services) strongly suggests that the two
series do not move in consistent fashion.36

33 James Tobin, “A Statistical Demand Function for Food in the U.S.A.,” Journal of the Royal
Statistical Society, Vol. CXIII, Part II, 1950.

34 Ibid., p. 130.
35 One may well raise the question whether [...] is involved. Tobin carefully

36 Tobin infers that the failure to include in the series services in food distribution would understate
changes “in the supply of ‘finished’ foodstuffs,” but nevertheless uses it, presumably as being
reasonably sound. Ibid., p. 131.




These are essentially negative cautions—though evaluating the 
methods by which his data were originally secured seems an ele- 
mentary caution for any model builder to take, however pedestrian 
and however unrewarding such an inquiry proves to be. And analysis 
of the economic meaning of the resultant model—after the coefficients 
have been determined—seems equally urgent. 

D. Positively, however, there is something which econometricians
can do until that happy day when all economic data come from a single
well designed survey producing data of only the highest reliability.
What can be done is to set up a check model—or models—to parallel 
the basic model. Such a model would incorporate related series or 
measurements in place of those used in their basic model. Clearly one 
would hardly seek to redo the 650 equations of the BLS 1947 matrix. 
Nor is this necessarily required. What we do need is a truncated model 
to give us some feeling for what changed inferences from the model 
would be produced by given errors in the original data. Let me be spe- 
cific. 

In recent models some fairly complex allowances have been made for
changes in the distribution of income. To demonstrate that any of
these allowances are real advances over, say, the simple procedure used
by Tinbergen and Kalecki of distinguishing wage from non-wage
income, we should have some measure of the adequacy of reported
income distributions. One method would be to compare approximately
similar distributions. Table I gives two distributions, one for
1934-35 and one for 1935-36.37 The apparent differences between the
1934-35 and 1935-36 distributions are great: far greater, in fact,
than the differences between surveys of slightly different populations
separated by four years (1935-36 to 1939) of rising income.38 A similar
comparison for 1949, say, can be made between the Census and FRB
surveys, and the differences are much smaller. In both instances the
model builder might usefully use first one, then the other distribution
to get an indication (not a precise measure) of the impact of survey
variations on his conclusions.


A second method would be to make arbitrary adjustments in the 


38 [...] from 1940 Census, Families, Gen[...], Table [...]. These data relate to all families
receiving wage or salary income. A distribution for all families without income other than
wages and salaries would be virtually identical for the $500 and over




data on the assumption of specified margins of error. For example, the
income data reported in the BLS survey of 1941 aggregated about 10%
less than control totals indicate was received. More recent surveys
have run from 5 to 10% low. What happens to our models if we allow
for such understatements at each date, or if we presume the possibility
that the underreporting of income decreased from the 1901 to the 1949
expenditure surveys?
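
A minimal sketch of such an arbitrary adjustment follows. It assumes that reported income equals true income times (1 - u), where u is the assumed understatement, and that expenditure is reported correctly; the second assumption is made only to keep the illustration simple. The dollar figures and function names are hypothetical.

```python
def saving_ratio(income, expenditure):
    """Saving as a share of income."""
    return (income - expenditure) / income


def check_under_reporting(income, expenditure, understatements):
    """Recompute the saving ratio under each assumed understatement of
    income. Assumes reported income = true income * (1 - u) and that
    expenditure is reported without error."""
    results = {}
    for u in understatements:
        adjusted_income = income / (1.0 - u)
        results[u] = saving_ratio(adjusted_income, expenditure)
    return results


# Hypothetical survey averages per family, in dollars.
reported_income = 2000.0
reported_expenditure = 1900.0

table = check_under_reporting(reported_income, reported_expenditure,
                              [0.0, 0.05, 0.10])
for u, ratio in table.items():
    print(f"understatement {u:.0%}: saving ratio {ratio:.3f}")
```

With these assumed figures the saving ratio rises from 5 per cent with no adjustment to 14.5 per cent when income is taken to be understated by 10 per cent. A model whose conclusions turn on the level of the saving ratio would therefore need to be run under each assumption.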


TABLE I

DENVER WAGE EARNER AND CLERICAL WORKER FAMILIES
WITH INCOMES OF $500 AND OVER: PER CENT
DISTRIBUTION BY INCOME LEVEL


All Families†


$500-999‡
1,000-1,499
1,500-1,999
2,000-2,499
2,500-2,999
3,000 and over


$500 and over


* Complete families, husband and wife native born.

† Families receiving $500 or more in wage or salary income.

‡ Families with incomes under $500 excluded from 1934 Survey; also excluded from 1935-36 and
1939 data here for comparability.

§ Less than 1%.


The measures need not be as close conceptually as this and cannot be
in most instances. However, trends in the number of nonfarm
employees (as measured by the Census Bureau's Current Population
Survey) and in production (as measured by the FRB industrial production
index) should be definably similar. Trends in manufacturers' sales as
reported by the Commerce sales series, and the FRB production index
times the BLS price indexes should also bear a close relationship to one
another, and so on. The analyst can assess the result of using variant
measures, deciding how much appears to be accounted for by
conceptual differences, how much results from the fact that he chose one
series rather than another.

39 Selma F. Goldsmith, “Appraisal of Basic Data Available for Constructing Income Size
Distributions,” Studies in Income and Wealth (New York, National Bureau of Economic Research, 1951),
Vol. XIII, p. 285.




An additional reason for the use of check models is that we
inevitably concentrate on one slice of reality, one aspect of the
phenomenon being examined, when we select any given series as an
explanatory variable. It is frequently not clear whether that particular
aspect is the most relevant, real or useful one to consider. Designating
alternative measures helps triangulate the area of concern. As E. A.
Goldenweiser remarked some years ago:

no set of statistical series, to say nothing of any single series, is a sufficient
basis for determining causal relationships on which economic policy can be
predicated with safety. They are only indications of where one should look
for the causes and interrelationships that determine economic events.40


The moral applies as well to the use of models for analysis. 


III


How can we secure more adequate statistics both for policy
determination and for economic analysis by economic models?41 The
simple answer would be to pass a law: not on statistics, of course,
but on matters requiring statistics for their determination. Certainly,
for example, the operations of the Securities and Exchange Act have
developed better data on financial flows than even the combined talents
of Crum, Epstein and others had been able to produce by the process
of making bricks from straw. The passage of the Agricultural
Adjustment Act brought support for agricultural income statistics of a kind
that was rare even in the Department of Agriculture.

A more realistic answer to the question, however, must start from an
answer to the question: what kind of data are needed for economic
models? The answer, I believe, is essentially the same type of data as
is required for general economic analysis except that it is much more
urgent that the data be consistent. It is more of a problem for the
model builder than for the general economist that the official series on
plant and equipment expenditures is not consistent with that for sales
in machinery and related industries; that one government series can
report a 7% fall in auto inventories while another reports a 0.2% rise
for the same period; that one series reports a drop in profits from

40 “The Economist and the State,” American Economic Review, March 1947, p. 5. Cf. also Charles
D. Stewart and Loring Wood, “Employment Statistics in the Planning of a Full Employment Program,”
Journal of the American Statistical Association (September, 1946).

41 These purposes are not exclusive. In the preparation of the Federal Budget a set of economic
projections are made for national income, prices, employment and other materials organized as a
model. Similar materials are prepared by the staff of the Joint Committee on the Economic Report. Cf.
Samuel M. Cohn, “Managing the Expenditure Side of the Federal Budget,” a paper presented before
the American Society for Public Administration, Washington Chapter, November 7, 1952. Joint
Committee on the Economic Report, 82nd Congress, 1st Session (1951), The Economic and Political Hazards
of an Inflationary Defense Economy. Cf. especially




food manufacturing of 12% while another equally reputable series
reports a rise of 20%; that comprehensive reports on manufacturing
employment for the same year may differ by several million depending
on which series is selected; that employment data will be plant
reports, production data may be company reports and profits data from
consolidated company reports, each of these differences limiting our
use of all three variables in model building, and so on through a catalog
that is long, but not quite everchanging. To achieve such consistency,
plus an improvement in accuracy, the following program areas deserve
consideration.42

1. Financial aspects of business. Basic benchmark data can be
provided annually for all of business, corporate and unincorporated, from
tabulations of the corporate, partnership and individual proprietorship
income tax returns to the Bureau of Internal Revenue. Basic current
data, used to extrapolate the benchmark data from quarter to quarter,
can be secured from an expansion of the Financial Reporting Program
now conducted jointly by the Federal Trade Commission and the
Securities and Exchange Commission. That program, now restricted to
manufacturing corporations, should be expanded to other industries,
to unincorporated business. Now securing data on profit and loss, on
balance sheet items, it could reasonably secure related data on business
investment orders and plans for future sales and investment.43

The stakes for analysis here are first, accuracy; second, consistency;
and third, a considerable increase in the number of observations.
Instead of being restricted to one annual value for a year, only
imperfectly reflecting a change in economic direction, the analyst could look
forward to from 4 to 12 observations a year for a linked set of variables.
The result should bring a much better knowledge of economic variation
and relationship.
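
The extrapolation of a benchmark by current reports, mentioned above, can be sketched in a few lines. This is a simple proportional rule chosen for illustration; the text does not say which extrapolation method the program would use. The benchmark level, the indicator values, and the function name are hypothetical.

```python
def extrapolate(benchmark_level, indicator, base_period=0):
    """Carry a benchmark level forward by the relative movement of a
    current indicator series (simple proportional extrapolation)."""
    base = indicator[base_period]
    return [benchmark_level * value / base for value in indicator]


# Hypothetical figures: a benchmark total of 400 for the first quarter and a
# quarterly sample indicator covering only part of the reporting universe.
quarterly_indicator = [50, 52, 51, 55]
estimates = extrapolate(400, quarterly_indicator)
print(estimates)
```

The estimates are only as good as the assumption that the sample indicator moves in step with the full universe, which is why a later benchmark is needed to correct any drift.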

2. Employment. In the models developed by Hagen, the NPA,
Fortune and other postwar projectors, in Klein's model III, and in the 
recent emergency model sponsored by the Air Force, production data, 
establishment reported employment data (from BLS), and household 


and one for government. Cf. his The Role of Measurement in Economics (Cambridge, University Press,
1951), pp. 57 ff.; R. Stone, J. E. G. Utting, and J. Durbin, “The Use of Sampling Methods in National
Income Statistics and Social Accounting,” Revue de L'Institut International de Statistique, Vol. 18,
No. 1-2 (1950), p. 31; and corresponding comments in reports of the UN Sub-Commission on Statistical
Sampling. The consolidated surveys proposed here attempt to achieve the same goal of consistent data
but at the same time make use of the mass of reliable data made available regularly in the U.S. from
administrative reports.

43 Administrative relationships are not particularly relevant here but it might be noted that in
practice such a program would make use of such existing collection mechanisms as the Census Bureau's
Current Business Reporting sample, etc.



reported employment data (to the Census Bureau) are all utilized.
With labor a limiting resource in some projections, and unemployment
an ominous resultant in others, the consistency of these data is
particularly important.

Benchmark data for the basic BLS series come from the State
Unemployment Insurance and Old Age and Survivors Insurance Sys-


report annual payrolls and employment on their income tax returns, 
If this were done we could then look to the monthly Bureau of 
Labor Statistics reports, like the FTC-SEC reports, as a means of 


the BLS data might be sought as in the past to apply to individual 
establishments rather than (as in BIR) to entire firms. 


3. Consumers' financial activity. Moving from the sphere of business
activity to that of consumers we are confronted with similar considera-


BIR income tax returns do not now report employment, and permit the distribution of pay rolls
under at least two deduction items.


This proposal is discussed at greater length in the writer's “Labor Force Statistics: The Task
Ahead,” a paper presented at the 1950 annual meeting of the American Statistical Association.




Bureau of Human Nutrition and Home Economics and private
surveys made for the Department of Agriculture. The survey data in turn
are not fully consistent with the aggregates on income, saving and
consumption that have been derived from business records and
embodied in the national income accounts.

A program in this field, to be of value for economic models, must
provide consistent data, minimizing the number of measurements
coming from completely different surveys. To achieve this purpose use
can be made of the Census Bureau’s Current Population Survey for
securing an integrated body of data on incomes, expenditures, asset
holdings, family employment status and demographic characteristics,
as well as price and income expectations.47 The gradual development
of such a program would remove the considerable inconsistencies
which now develop when the model builder incorporates measurements
for many of these factors from the variety of surveys which now
provide them.

A program in this field must provide reliable data, and if possible
provide measurements which reflect variations in consumer behavior
as variations in economic activity occur. In practice this would mean
securing monthly reports from consumers on their economic activity in
the previous month or week.48 Such reports would facilitate accurate
reporting by consumers, since the memory feats required would be
minimized. No less important is the likelihood that their frequency
would give us an improved knowledge of consumer activity. For
example, interviews for the 1948 Survey of Consumer Finances were
conducted during January and February of 1948. Some 50 percent of the
respondents interviewed in January expected a price rise. The grain
market broke in February, and only 15 percent of those interviewed
after the break expected prices to rise.49 For the model builder the
increasing number of observations at different levels of the nation’s

47 This proposal was discussed at greater length in a paper prepared for the 1949 annual meeting
of the American Statistical Association, “The Validity of Interviews: Consumer Expenditure Surveys.”
To minimize respondent burden and nonresponse a system of replicated designs would be necessary:
broad information would be taken from the main sample and subsidiary details from related samples.

Small scale studies by the Bureau of Human Nutrition and Home Economics suggest that some of
the advantages of replicated samples may be delusive but that the procedure is practicable. Cf. Barbara
B. Reagan and Evelyn Grossman, Rural Levels of Living in Lee and Jones Counties, Mississippi, 1946,
and a comparison of two methods of data collection (October 1951), USDA Agriculture Information
Bulletin 41, esp. Part 2.

48 More than one period may be used, since reporting for long periods may provide reliable data
for some classes of items (e.g. cars purchased), whereas briefer periods may be necessary for other
items (cigarettes, milk, etc.).

49 Federal Reserve Board of Governors, 1948 Survey of Consumer Finances, Part I, Table 7. The
relative change for farm operators and other groups as a whole was almost identical. Cf. James C.
Davies, “Some Relations Between Events and Attitudes,” The American Political Science Review
(September 1952), p. 780.




economic activity should increase the number of variables which he
can include in his model, or test with any conclusiveness.50

Children's stories used to end with the magic words, “and they lived 
happily ever after." The prospect for econometricians, unhappily, is 
quite the reverse. Data will continue to be inadequate. What concepts 
lie behind the statistical measurements and what errors are hidden 
within them will continue to be issues for exploration. Econometricians 
will increasingly have to delay their more fascinating analytic work in 
order to ponder on the data and results, making certain that their
findings do not merely quote errors, or assumptions inherent in the 
original measurements. They will in all probability find it essential to 
confirm each model with check models, making use of whatever scraps 
of evaluative information are offered them by data producers. In all
this, however, there is at least one measure of consolation—the increas- 
ing frequency with which data producers feel obliged to check their 
data against control figures, to conduct post enumeration surveys, 
and to rely on scientific sampling so that at least measures of sampling 
errors will be available for data users. For both the producers of data 
and those who use them in economic models have a similar goal—to 
improve our understanding of economic change. 


50 It is, of course, possible to multiply observations even more cheaply, by a process of interpola-
tion. This procedure produces useful data for some purposes. However, it does not really add to the
number of independent observations of change through time. For a contrary view, in practice, cf.
Colin Clark's use of Barger's interpolated quarterly data in his “A System of Equations Explaining the
United States Trade Cycle 1921 to 1941," Econometrica (April 1949).




TECHNICAL ASPECTS OF TRANSPORTATION FLOW DATA 


R. Tynes Smith, III


Bureau of Transport Economics and Statistics 
Interstate Commerce Commission* 


This paper discusses the technical aspects of two problems relating
to the 1 per cent sample of rail carload waybills currently being 
secured by the Interstate Commerce Commission. One concerns the 
selection of the sample itself while the other involves the use of the 
sample information for estimating a desired transportation statistic. 
It is proposed to show that the present waybill sample is a simple 
yet powerful tool for transportation analysis. After a short historical 
review of the background of the problem in general, the sample selection
procedure is described. This yields adequate but biased results so an
additional adjustment is made which eliminates this bias and gives an
efficient and representative sample. After this description an example
of the straightforward manner in which the waybill data may be used
to get answers to complex transportation problems is given by describ- 
ing the technique used to develop a series of rate indexes. Finally 
several methods by which the standard deviations of these estimated 
indexes were actually determined will be outlined. But before proceed-
ing a brief description of a waybill may be helpful for those readers
who are not transportation experts.
A bill of lading is first prepared by the shipper which tells the carrier, 
among other things, the kind and amount of commodity to be shipped, 
and the points of origin and destination. The carrier prepares a waybill 
from this document which accompanies the shipment and which also 
provides the basis for assessing the transportation charges. After the 
shipment is delivered, the waybill is audited by the terminating carrier
to correct the charges and is then filed. This audited document, there- 
fore, provides a record of the actual transportation services performed 
and the charges made for that service. A copy is illustrated on page 238. 
The use of waybill data in the analysis of transportation problems is 
not new. As a matter of fact, results of waybill studies have been used 
in proceedings before the Commission for at least fifty years. The first 
nationwide study was made in 1932, and the present waybill sample
currently being secured by the Commission represents the first attempt
to get continuing information in this manner. There has been a gradual
evolution both in the type of data collected and in the technique of
collection during this period.


* This paper has not been considered by the Commission.




The first individual studies were made to develop specific information 
and often covered only the movement of single commodities or only
the traffic between particular points. There was no sampling problem
because all of the bills for all of the traffic involved in a specific period
of time were selected. The data were usually presented simply as ob-
served facts for that period.

The first nationwide study was made by the Federal Coordinator of 
Transportation based on waybills covering one day's terminations in
the year 1932. However, it was clearly recognized at that time that the
sample could not be considered as a representative one so only limited
generalizations were made from the relationships developed. The mate-
rial was tabulated and no attempt was made to expand it as an estimate
of the total traffic for a longer period than the actual day covered.

The first major attempt to obtain a countrywide representative
sample for a whole year was made by the Board of Investigation and
Research. Each Class I railroad was requested by that Board to supply
it with certain information for all carload traffic terminated on its line
on one designated day in each month of the year 1939. Different days
in the month were assigned to the railroads in any one region so as to
allow a better sampling of brief seasonal movements and the dates for
parallel roads were staggered. Staggering of dates was arranged so far
as possible to catch every week day during the week twice during the
year. This design was a major advance because the resulting sample
could be considered as representative of the total traffic for the whole
year.

Several waybill studies were conducted by defense agencies during
the war. The War Department started a continuous sample of the bills
of lading issued for its commercial traffic within the United States. This
was a systematic sample which at the beginning was secured by simply
counting the documents and making a selection in the proportion de-
sired. Refinements were made in this procedure from time to time and
a more efficient method was finally developed making use of the ter-
minating digits of the bill of lading number.

It was found that a number of important characteristics such as type
of commodity were related to the originating point and so stratification
by origin prior to sampling greatly increased the efficiency of the sam-
ple. The bills of lading were prenumbered serially in large blocks and
furnished in groups to the issuing office so that the desired stratification
could be accomplished by arranging the bills according to their num-
bers. Analysis of the systematic selection of these bills led to the
choice of a combination of the terminating digits as a selection device
which made the initial arrangement of the bills by number unnecessary
but which preserved the advantages of origin stratification.

In 1946 the Commission started work on its waybill sample design 
and, of course, gave consideration to past experience in this field. Two 
practical methods of selection had been developed, the first based on 
all traffic terminating on certain days, and the second based upon ter- 
minating digits in the waybill number. Each of these plans offered 
peculiar advantages and disadvantages. 

The selection of all traffic terminating on specified days was favored
by many of the carriers because of its apparent simplicity. They also
felt that there was some advantage in doing the necessary work in-
volved in a short period of time in order to get it over with. However,
there were some serious objections to this type of sampling, especially
in connection with the ability to secure adequate representation of short
seasonal movements and with the difficulty of determining the possible
sampling error involved in estimates prepared from the sample.

The difficulty in determining standard errors of estimates based on 
solid days studies lies not in the fact that these are cluster samples, but 
that the clusters (days) are usually chosen on a judgment rather
than a probability basis. Even if a suitable method of probability
selection were applied the efficiency would be low because the seasonal
factor of much of the traffic makes the between-days variance of many
important characteristics quite high. Since practical sample sizes lie
between four and twelve days for a year, it is evident that estimates
would necessarily be subject to relatively large standard errors. The
author was recently told of a case where a sample consisting of four
days, one in each quarter, was found to contain no observations of an
important commodity known to be moving in large volume. A check
revealed that there actually had been no movement on the selected
days due to chance circumstances such as strikes and floods.

An alternative method of sample selection is based upon the ter-
minating digits of the waybill number. This was a new idea to most of
the carriers and many of them felt that it would present a more difficult
problem than would the solid day's sample. It was also determined that
there were a number of railroads using monthly numbering systems
which would produce a bias in this type of selection if a standard com-
bination of terminating digits were used for selection. Balancing these
objections were the very important considerations of greatly increased
efficiency in the sample, including the ability to reflect very small sea-
sonal changes, and the possibility of making reasonably accurate
estimates of the sampling errors of sample results. After careful con-
sideration of these and other factors it was decided to base the sample
selection on the terminating digits of the waybill number.

A waybill sample could be selected either at the point of origin or at 
destination but the terminated bill is preferred because it contains 
complete information regarding the actual movement of the shipment 
and audited results of the charges assessed. However, this choice pre- 
sents a problem when the selection is based on waybill numbers because 
these numbers are prepared at the origin points. There are several
common methods of numbering waybills and the possible effect on the
sample of these methods must be considered in the sample design. 

All waybill series are numbered progressively to some point and then
start over again at the initial number. The end of the series may be de-
termined by some period of time, such as the end of a month, or by
some number in the series itself such as the waybill number 10,000,
100,000, and so on. This latter type of series may be called a block
system and it can easily be seen that the selection of all waybills with
numbers ending in any given pair of digits from such a series will yield
a systematic and unbiased one per cent sample of the waybills if the
length of the block is some multiple of 100. Also, since it appears that
the traffic characteristics of a shipment are independent of the ter-
minating digits of the waybill number on which it moves, an unbiased
sample of waybills based on these digits will also be an unbiased and
representative sample of the shipments. It is immaterial in this case
whether the selection is made at origin or destination.
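
The block-system selection just described can be illustrated with a minimal sketch in Python. The function name and the 10,000-bill block are hypothetical and chosen only for illustration; the sketch simply checks that choosing numbers with a fixed pair of terminating digits takes one bill in each hundred at a constant interval when the block length is a multiple of 100.

```python
def select_by_terminating_digits(waybill_numbers, ending="01"):
    """Keep the waybills whose number ends in the given pair of digits."""
    target = int(ending)
    return [n for n in waybill_numbers if n % 100 == target]


# Hypothetical block system: numbers run from 1 to 10,000 and then start over.
block = list(range(1, 10_001))
sample = select_by_terminating_digits(block, "01")

# Fraction of the block that was selected, and the gaps between selections.
rate = len(sample) / len(block)
intervals = {later - earlier for earlier, later in zip(sample, sample[1:])}
```

Any other pair of digits would serve equally well in a block system; the pair "01" matters only for the small stations that restart their numbers each month.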

The practice of starting waybill numbering series over again at the
end of a period of time, such as each month or year, presents a dif-
ferent problem. It does then make a substantial difference whether the
sample selection is made at origin or destination. An example of the
effect at several small stations numbering their bills on a monthly
basis will illustrate.

Suppose a sample of one out of a hundred waybills is desired from a
group of stations each issuing less than a hundred waybills per month.
If the sample is to be selected at the origin station, there are a number
of acceptable devices which could be used. A simple and direct ap-
proach would be to establish a separate register for checking off the
bills as they were issued which would indicate each hundredth one for
selection. This method, of course, is independent of the type of number-
ing system or of the volume of bills issued. Other methods can be de-
vised using the waybill number but they immediately run into the
difficulty that this number alone is not sufficient to make a one per cent
selection. There will only be as many different numbers as there are
bills issued each month so that in the case of small stations some
method of compounding must be used if a one per cent probability of
selection is required. It is evident that this compounding device must
be based on the volume of traffic at the issuing station and the waybill
number initially used cannot be greater than the total issued in any
one month.

These considerations, and the fact that the selection was to be made
by the terminating carrier, dictated the choice of waybill numbers “1”
or ending in the digits “01” as the best selection device. This permitted
the issuance of standard, simple, and unambiguous instructions which
are so necessary for successful operation. The compounding device re-
quired to secure a one per cent selection from the small stations was a
procedure established at the Commission for subsampling the “1”
bills from these small stations with a probability proportional to the
average number of bills issued per month over a recent 12-month
period. This method is roughly equivalent to weighting the received
bills; the expected values will be the same although the variances will
be somewhat higher because of the discarded observations. The dif-
ference is relatively small and is more than compensated for by in-
creased simplicity in subsequent operations.
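
A minimal sketch of this subsampling step follows, in Python. The paper does not give the constant of proportionality; the sketch assumes a retention probability of m/100 for a station averaging m bills per month, since a station that yields one “1” bill per m bills issued is then sampled at an overall rate of (1/m)(m/100), or one per cent. The function names and the example station are hypothetical.

```python
import random


def retention_probability(avg_bills_per_month, interval=100):
    """Assumed chance of keeping a "1" bill from a small monthly-numbered station."""
    return min(1.0, avg_bills_per_month / interval)


def subsample_first_bills(first_bills, avg_bills_per_month, rng):
    """Discard "1" bills by a chance process, keeping each with the assumed probability."""
    p = retention_probability(avg_bills_per_month)
    return [bill for bill in first_bills if rng.random() < p]


# Hypothetical station averaging 25 bills a month: twelve "1" bills in a year
# out of about 300 issued, each kept with probability 0.25.
rng = random.Random(1954)
first_bills = [f"month-{m:02d}-bill-1" for m in range(1, 13)]
kept = subsample_first_bills(first_bills, avg_bills_per_month=25, rng=rng)
expected_kept = len(first_bills) * retention_probability(25)
```

On these assumptions the expected number kept is 3 of roughly 300 bills issued. Weighting every received bill by 0.25 instead would give the same expected totals with a smaller variance, which is the trade-off noted in the text.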

The excess bills over the required one per cent, which result from
monthly and annual numbering systems, are an unnecessary cost to
both the carriers and the Commission and so the carriers have been
urged to change to the more desirable block system. Many have done
so and consequently the problems associated with establishment of a
proper sample design for small stations are becoming less important.

After developing and applying a correct sample design, it is evident
that a satisfactory sample depends upon the complete and accurate
selection of the designated waybills. These waybills are selected by the
reporting carriers in over one hundred offices through the occasional
efforts of perhaps a thousand individuals. This factor introduces pos-
sibilities of error which require constant policing to assure the desired
accuracy. Fortunately, the Commission receives another report on
carload traffic which can be compared to the waybill sample as an
effective check for completeness and accuracy.

This report is known as the quarterly freight commodity statistics
and shows for each railroad the total number of cars terminated by
that road for each of the 261 carload commodity classes divided be-
tween traffic which originated on the reporting road and traffic which
was originated on other carriers, and was delivered to the reporting
road for termination. Each quarter the waybill sample is tabulated in
this same manner for each reporting carrier and the results, com-
modity by commodity, are compared to 1 per cent of that road's
freight commodity statistics. Any significant discrepancies noted in 
this comparison are carefully reviewed and the causes for such dis- 
crepancies are determined. It has been interesting to note that some of 
the discrepancies have been due to errors in the commodity statistics 
rather than in the sample and so in these cases the sample has actually 
been used as a quality control for the 100 per cent report. An idea of 
the effectiveness of the control of the sample may be gained from the 
fact that in each quarter there are in the neighborhood of 20,000 indi- 
vidual comparisons. The sample itself includes about 75,000 observa- 
tions. 
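
The quarterly control comparison can be sketched as follows, in Python. The paper says only that "significant discrepancies" are reviewed and gives no numerical rule; the allowance of three times the square root of the expected count is an assumption made here for illustration, as are the function name and the figures.

```python
import math


def flag_discrepancies(sample_counts, control_totals, z=3.0):
    """Return the commodity classes whose sample count is far from 1 per cent of control."""
    flagged = {}
    for commodity, total_cars in control_totals.items():
        expected = total_cars / 100            # 1 per cent of the complete count
        observed = sample_counts.get(commodity, 0)
        allowance = z * math.sqrt(expected)    # assumed yardstick, not from the paper
        if abs(observed - expected) > allowance:
            flagged[commodity] = (observed, expected)
    return flagged


# Hypothetical figures for one road in one quarter.
control = {"001 Wheat": 40_000, "305 Bituminous Coal": 900_000}
sample = {"001 Wheat": 395, "305 Bituminous Coal": 8_200}
flagged = flag_discrepancies(sample, control)
```

A flagged class shows only that the two reports disagree; as the text notes, the error may lie in the commodity statistics rather than in the sample.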

The selection of waybills on the basis of the waybill number provides
in effect for a sample of every hundredth bill issued by each station. 
This is actually true in the case of stations issuing bills on a block sys- 
tem and in the case of stations issuing many hundred bills per month 
and is approximated at the others. Consequently the sample is quite 
efficient because of the relationship between the originating station 
and important transportation factors such as the commodity and ter- 
ritory. The periodic comparison of the sample to control totals helps 
reduce selection errors and assures its validity. The sample has found 
many uses in varied fields and its application to one transportation prob-
lem will be discussed, but before doing so it might be well to consider
the foregoing procedures in the light of modern sampling theory [1]. 

The universe considered here is the totality of rail carload shipments 
terminated by Class I railroads in the United States. The sampling
unit is the waybill, which in general covers a single carload, and the
frame is the complete series of numbered waybills. These numbers are
assigned at the origin station so that when a block numbering system 
is used selection based on the terminating digits “01” will provide a 
systematic sample of every hundredth waybill issued, stratified by 
origin. Since the shipment itself is independent of the terminating 
digits of the waybill number this will result in an unbiased, random— 
and therefore representative—one per cent sample of carload ship- 
ments. 

A modification of this design is necessary in the case of small stations
numbering their bills on a monthly system. Where at least one but less
than 100 bills are issued per month the selection of the “1” bills provides
a sample consisting of the first shipment in the month for each station.
This, therefore, is not necessarily a random selection except in the sense
that there is some chance element in whether or not a particular ship-
ment will be the first one. The essential random element is provided by
a further process of subsampling these initially selected bills.

The “1” bills from each of the small stations over several months 
form a series which approximates a systematic sample with a sampling 
interval equal to the average number of bills issued per month. The 
desired interval is 100 and this in turn can be approximated by discard- 
ing a sufficient number of bills so that the resulting average interval is 
100. The bills to be discarded are selected by a chance process with 
appropriate probabilities so the final sample is a random one. This per- 
mits valid estimates of standard errors, an example of which is given 
later.

There is a problem of non-response due usually to occasional failure 
to select all the required “1” and “01” bills. These failures are essen- 
tially random in character and do not appreciably affect computed 
averages or ratios although of course they do introduce a negative 
bias in estimates of aggregates. The quarterly comparison to the 100
per cent count of the commodity statistics report provides an excellent
control and good progress is being made in the establishment of ac-
curate mechanical checks by the reporting carriers which reduce the
non-response error. 

Many diverse uses have been made of the Commission's waybill 
sample but it is proposed to examine the technical aspects of its use 
in only one problem. This problem, however, is one of sufficient com-
plexity to indicate the value of the sample as a research tool.

The need for an index which would measure changes in rail carload 
freight revenue resulting solely from increases or decreases in freight 
rates has been recognized for years. Experimental indexes for various 
commodities were developed by the Commission but the methods used 
were too costly to permit any extensive expansion of these series. The 
Commission's 1 per cent waybill sample provided a new approach to
the problem and offered the hope that annual indexes of many freight 
rates could be prepared and kept current with a relatively small ex- 
penditure of time and funds. The method finally adopted uses, with 
minor modifications, information contained in the regular releases of 
waybill statistics.

The regular Commission waybill statistics are tabulations of punch 
cards prepared from the waybill samples reported by the terminating
carriers. The illustration on page 238 shows facsimiles of a typical way-
bill report and the detail and waybill cards prepared from that report.
It also gives a brief description of the processing of the bills and cards.

Mileage block statistics have been developed by classifying the
traffic according to traffic categories determined by the commodity
class, short-line length of haul, type of rate, and territorial movement.
These are the statistics used in the preparation of the freight rate in-
dexes. It should be noted that the rates, or prices charged for trans-
portation, are related to each of these factors and they were chosen with
a view towards making the items within each resulting traffic category
as homogeneous as possible. The sample shipments for each year fall
into about 30,000 of these traffic categories.

The indexes are constructed by comparing the tonnage and average 
revenue per ton of matching categories in the base and comparing year. 
Two sets of revenue figures are prepared from these data. The first 
consists of the actual revenue for the base year which, of course, is 
simply the total of the revenues for each of the comparing categories 
in this year. A second set of revenue figures is obtained by applying
the average rate per ton for each category in the comparing year to the 
tonnage which moved in the same category for the base year. This 
produces a figure which represents the revenue which the traffic moving 
in the base year would have produced at the average rates in effect 
in the comparing year. The ratio of these two revenues yields the desired 
index figure. 
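
The index construction described above can be sketched in Python. The function name, the data layout (tons and revenue per traffic category) and the figures are hypothetical, and the sketch assumes that the index is the repriced base-year revenue divided by the actual base-year revenue, expressed with the base year as 100.

```python
def rate_index(base, comparing):
    """Index of average rates for the comparing year, base year = 100.

    Both arguments map a traffic category to a (tons, revenue) pair.
    Only categories present in both years are used.
    """
    actual_revenue = 0.0     # base-year revenue of the matching categories
    repriced_revenue = 0.0   # base-year tonnage at comparing-year average rates
    for category, (base_tons, base_revenue) in base.items():
        if category not in comparing:
            continue
        comp_tons, comp_revenue = comparing[category]
        if base_tons == 0 or comp_tons == 0:
            continue
        average_rate = comp_revenue / comp_tons
        actual_revenue += base_revenue
        repriced_revenue += average_rate * base_tons
    return 100.0 * repriced_revenue / actual_revenue


# Hypothetical categories: (tons, revenue in dollars).
base_year = {"A": (1000, 5000.0), "B": (500, 4000.0)}
comparing_year = {"A": (1200, 6600.0), "B": (400, 3600.0), "C": (50, 700.0)}
index = rate_index(base_year, comparing_year)
```

Because the base-year tonnage is held fixed, a change in the mix of traffic between the two years does not by itself move the index; category "C", which has no match in the base year, is left out.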

It is evident that factors other than changes in freight rates might 
affect the average revenue per ton for a given traffic category. For 
example, the average length of haul within the mileage block assigned
to the traffic category might be greater or less in the comparing year
than in the base year. Similarly, the specific commodities which moved
within the given commodity class might be different and take different
rates. Therefore, the average revenue effect of such differences must be
small as compared to the effect of rate changes if the resulting indexes
are to be a satisfactory measure of changes in rates. Careful study of
this problem does indicate that this requirement is substantially satis-
fied. As noted before, the traffic categories were chosen so as to be as
nearly homogeneous with respect to rate characteristics as practicable.
The mileage blocks are shorter where changes in rate progression are
most rapid. The various territorial movements are kept separate as 
are movements on interstate and intrastate rates and there is a further 
classification by commodity class and type of rate. Consequently, the 
area of fluctuation in average revenue from causes other than changes 
in the rate is relatively small in each of the traffic categories. 

It is reasonable to expect that even when such fluctuations as these
do occur, they are as likely to affect the results in one direction as
another so that there is a tendency for the errors to compensate rather
than to accumulate. In consequence, the net effect is that the changes in
average revenues noted by this method will usually reflect quite closely
the revenue effects of actual changes in rates. Therefore, the indexes
will be a measure of the average changes in rates. An illustrative page
from the published statement is shown in Table 1.

These indexes are based upon only one of the 100 similar 1 per cent 
samples which could have been selected from the waybills representing 
the total carload traffic. Presumably indexes computed in exactly the 
same manner from each of the 100 different samples would have pro- 
duced 100 slightly different results. Each of these in turn would have 
been an estimate of the index which could theoretically have been pro-
duced from a 100 per cent sample consisting of all of the waybills. This
condition, of course, immediately raises the question of how much 
does a particular index computed from the sample which actually was 
selected differ from the theoretical index which could have been pre- 
pared from the total of all waybills. 

There are a number of methods by which an estimate of the standard
deviation of any computed index could be obtained but in particular 
there are two relatively simple procedures which were actually applied 
to the computed rate indexes and which have been used for similar de- 
terminations of the confidence which could be placed in other estimates 
prepared from the waybill sample. 

It was noted that one would expect somewhat different results from
computations made on each of the 100 possible 1 per cent samples which
could have been drawn from the total of all waybills. If the variation in
these estimates could be known, it would be easy to calculate the stand-
ard deviation for one of them. Obviously, it is not possible to get such
figures as these but an extension of the reasoning can be made to the
sample which is available. If this sample is divided up into, say, ten
parts in a manner similar to the method used in the selection of the
original sample, then the variation observed in separate estimates made
from each of these ten parts can be used to estimate the possible varia-
tion in the estimate prepared from the whole sample [1]. If $X_1, \ldots, X_{10}$
are the ten subsample estimates then the variance of the total sample
estimate $\bar{X}$ is approximately:

$$S_{\bar{X}}^{2} = \frac{1}{10} \cdot \frac{1}{9} \sum_{i=1}^{10} (X_i - \bar{X})^{2}$$

or, using the range $R = X_{\max} - X_{\min}$, the standard deviation can be
estimated from the relationship [2]


TABLE 1

INDEXES OF AVERAGE FREIGHT RATES FOR COMMODITY
GROUPS AND SELECTED COMMODITY CLASSES*

(1950 = 100)

                                                          Index                  Per cent   Approx.
                                                                                 Increase      Std.
Item                                          1947  1948  1949  1950  1951  1952 1952 Over   Devia-
                                                                                     1947     tion
All Commodities                                 80    93    99   100   102   109       36      .5
Group I—Products of Agriculture                 80    93    98   100   102   108       35      .5
  Class 001 Wheat                               80    93    95   100   102   108       35     1.0
        003 Corn                                75    94    95   100   102   109       45     1.0
        038 Cotton in Bales                     82    92    98   100   102   112       37     1.0
          † Oil Bearing Crops                   78    91    99   100   103    11       42     4.0
          † Fresh Fruits                        83    95    99   100   101   106       28      .5
          † Fresh Vegetables                    82    95    99   100   101   106       29      .5
        085 Potatoes, Other Than Sweet          79    94    99   100   101   107       35     1.0
        101 Sugar Beets                         86    96   103   100   106   110       28     3.0
        199 Products of Agriculture, N.O.S.     88    94   105   100   101   116       32     3.0
Group II—Animals and Products                   77    93    99   100   102   110       43      .5
  Class 203 Cattle and Calves, S.D.             79     9    98   100   103   112       42     1.0
        215 Meats, fresh, N.O.S.                78    90    98   100   103    10       51     1.0
Group III—Products of Mines                     83    91    98   100   102   108       30      .5
  Class 301 Anthracite Coal, N.O.S.             81    90    99   100   100    10      931      .5
        305 Bituminous Coal                     81    89    98   100   102   107       32      .5
        307 Coke                                83    91    97   100   103   108       30     1.5
        309 Iron Ore                            88    95    99   100   103   110       25      .5
        328 Clay and Bentonite                  84    95    98   100   104    11       32     1.0
        325 Sand, Industrial                     7    91    98   100   105   113       47     1.0
        327 Gravel and Sand, N.O.S.             90    98   102   100   102   107       19     1.0
        329 Stone and Rock: Broken,
              Ground and Crushed                85    93   100   100   103   108       27     1.0
        331 Fluxing Stone & Raw Dolomite        75    88    98   100   104   109       45
        337 Petroleum, Crude
        339 Asphalt
        341 Salt
        343 Phosphate Rock                      81    95   100   100   104   110       36
        399 Products of Mines, N.O.S.           82    98   100   100   103   112       37
                                                79    93    98   100   102   109       38      .5
                                                 8    93    99   100   103   108       30     1.5
              Wooden                             7    92    97   100   100    11       41     1.5
        409 Pulpwood                            88    98   100   100   104   110       25      .5
        422 Lumber, Shingles, and Lath          78    92    оз   100   102   109       49      .5
        499 Products of Forests, N.O.S.         83    95   100   100   104   112       35     2.5

* Only classes having more than approximately 1,000 cars in the sample are shown separately.
† These groupings of individual commodity classes are shown at the request of the Department of
Agriculture. “Oil bearing crops” include commodity classes 037 to 047, 097 and 105; “Fresh fruits",
near through 0^9; “Fresh vegetables” classes 077 through 089 (including 085 which is also shown
separately).


This latter estimate is only about 85 per cent as efficient as the former
but has the great advantage of simplicity and ease of computation.

The first estimates of the standard deviation of the rate indexes
were made by following this procedure in principle. It was recognized,
however, that the preparation of ten separate sets of indexes from ten
subsamples would be too great a task for the limited staff available.
Consequently, the individual waybills in the sample were divided into
only five groups. This division was made as nearly as possible on the
same basis as that by which the initial sample was selected. Each of
these five groups was classified according to traffic categories and five
different sets of indexes were computed. The variation in these indexes
then yielded estimates of the standard deviation for the indexes pre-
pared from the total sample. This method, however, proved to be an
even greater task than initially contemplated and a less accurate but
substantially less costly procedure was developed.
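
The subsample calculation can be sketched in Python for any number of groups. The first function follows the usual variance-of-the-mean form for replicated subsamples. The second uses the range with the standard quality-control constants (the expected range of 5 or 10 normal observations in units of their standard deviation); that form is an assumption about the kind of relationship meant, since the paper's own range formula is not reproduced here. Function names and the ten index values are hypothetical.

```python
import math

# Expected range of k normal observations, in standard deviation units.
# These are standard tabled constants; their use here is an assumption.
EXPECTED_RANGE = {5: 2.326, 10: 3.078}


def std_dev_from_subsamples(estimates):
    """Standard deviation of the whole-sample estimate from k subsample estimates."""
    k = len(estimates)
    mean = sum(estimates) / k
    variance_of_mean = sum((x - mean) ** 2 for x in estimates) / (k * (k - 1))
    return math.sqrt(variance_of_mean)


def std_dev_from_range(estimates):
    """Quicker, less efficient estimate based on the range of the subsample estimates."""
    k = len(estimates)
    sample_range = max(estimates) - min(estimates)
    return sample_range / (EXPECTED_RANGE[k] * math.sqrt(k))


# Hypothetical index values computed from ten subsamples.
indexes = [108.2, 109.1, 108.7, 109.5, 108.9, 109.3, 108.4, 109.0, 108.8, 109.1]
sd_full = std_dev_from_subsamples(indexes)
sd_range = std_dev_from_range(indexes)
```

With only five groups, as in the first trial described above, the same functions apply; the estimate simply rests on four degrees of freedom instead of nine.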

The second procedure made use of the same mileage block cards that
were used in developing the initial indexes. These cards were randomly
distributed into ten groups and the indexes prepared as before for each
of the groups. The variation in these indexes then yielded an estimate
of the standard deviation of the index prepared from the total of the
mileage block cards. While this procedure is not as accurate as the one
using the individual waybill cards, the error is on the safe side because
there is a bias which tends to indicate a larger value of the standard
deviation. However, the substantial savings resulting from the use of
mileage block cards instead of the individual waybill cards more than
offset the loss in accuracy.

Incidentally, the mileage block cards were randomly distributed into ten
subgroups by means of a one column sort on the units position of the
revenue field. The digits in this field are equivalent to the sum modulo
ten of the units digits in the waybill cards included in each block. It has
been shown [3] that such a process of “compound randomization” will
yield sequences approaching equal probabilities for each digit, even
when the generating sequences are quite biased. Consequently it is
permissible to use aggregate fields of summary cards in this manner even
when the same fields in the detail cards would be unsatisfactory. Ex-
tensive tests have been made on a table of random digits [4] produced
in a similar manner from waybill data which indicate that considerable
confidence can be placed in the effective randomness of the results.
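
The effect of summing digits modulo ten can be illustrated with a short exact calculation in Python. The biased digit distribution below is hypothetical (revenue figures ending mostly in 0 or 5), the digits are assumed independent, and the function name is illustrative; the sketch is not the method of reference [3], only a check of the property described.

```python
def units_digit_distribution(digit_probs, n_cards):
    """Exact distribution of the units digit of a sum of n_cards independent digits."""
    dist = [1.0] + [0.0] * 9          # a sum over no cards has units digit 0
    for _ in range(n_cards):
        updated = [0.0] * 10
        for total_digit, p_total in enumerate(dist):
            for digit, p_digit in enumerate(digit_probs):
                updated[(total_digit + digit) % 10] += p_total * p_digit
        dist = updated
    return dist


# Hypothetical, heavily biased units digits on the detail cards:
# 0 with probability .5, 5 with .3, 1 and 2 with .1 each.
biased = [0.5, 0.1, 0.1, 0.0, 0.0, 0.3, 0.0, 0.0, 0.0, 0.0]

one_card = units_digit_distribution(biased, 1)
forty_cards = units_digit_distribution(biased, 40)
largest_departure = max(abs(p - 0.1) for p in forty_cards)
```

With a single card the units digit is as biased as the source; with forty cards to a block every digit is very nearly equally likely, which is why the summary cards can be sorted on this position.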

The results of the two procedures for estimating the index standard
deviations were in substantial agreement although individual examples
of wide variation did appear. These were usually due to unequal dis-
tributions of traffic in the mileage block cards which of course is one
of the prices which must be paid for using this approximation method.
However, as was expected, the discrepancies were usually on the side of
overestimating the error and so the less costly procedure is considered
a safe one for determining the confidence which can be placed in the
estimates.

[Illustration, page 238: Interstate Commerce Commission, Bureau of Transport Economics and Statistics. Continuous Carload Waybill Study: Card Forms and Procedure. Detail card.]

The preceding discussion has covered the selection and adjustment 
of the carload waybill sample, and the method used to check its 
completeness and accuracy by comparison with the carriers' freight 
commodity statistics report. It has shown how the regularly published 
waybill statistics can be used, with minor modifications, to get the answer 
to a complex transportation problem in a simple and straightforward 
manner. As a final example several methods for estimating the standard 
deviations of sample results were developed. In each case the 
techniques used were simple and direct. This is characteristic of the 
current waybill sample and illustrates its value as a tool for 
transportation analysis. 


REFERENCES 


[1] Deming, William Edwards, Some Theory of Sampling, New York, John Wiley 
& Sons, Inc., (1950). 

[2] Dixon, Wilfrid J., and Massey, Frank J., Jr., Introduction to Statistical 
Analysis, New York, McGraw-Hill Book Co., Inc., (1951). 

[3] Horton, H. B., and Smith, R. T., "A direct method for producing random 
digits," Annals of Mathematical Statistics, 20 (1949), 82-90. 

[4] Horton, H. B., and Smith, R. T., Table of 105,000 Random Decimal Digits, 
Washington, D. C., Interstate Commerce Commission, (1949). 


RESPONSE ERRORS IN THE COLLECTION OF WAGE 
STATISTICS BY MAIL QUESTIONNAIRE 


SAMUEL E. COHEN AND BENJAMIN LIPSTEIN 
U. S. Bureau of Labor Statistics 


The literature on response errors in mail questionnaires has for the 
most part been devoted to the personal characteristics of the 
individual, e.g., age, education, attitude toward social questions, etc. 
Little attention appears to have been given to response errors where 
the data are collected from firms or organizations. It is doubtful, 
however, that a sharp separation can be made between response errors in 
personal characteristics and data on business operations. In the small 
firm, elements of prestige may exist with respect to exaggerating sales, 
production, or employment. The large firm on the other hand may be 
reluctant to disclose the extent of its operations for competitive 
reasons, or in government surveys may fear that the information may be 
used for other than statistical purposes. Further, during periods of 
economic control, some business firms may consider it advantageous to 
bias reports in their own favor for future administrative action. Thus, it 
appears that a parallel may exist between response biases of individuals 
and the response bias of firms. 

The problem of errors of response of firms to a mail questionnaire was 
recently studied in a wage survey of the rainwear industry conducted 
by the U. S. Bureau of Labor Statistics. Two kinds of response errors 
were examined: (1) coverage errors and (2) errors in the reporting of 
wage data. Coverage error as part of response error exists in this 
instance because the firm classifies itself as being in or outside of the 
industry. It includes firms that were incorrectly classified as being in the 
industry as well as those which were incorrectly excluded. 

Errors in reporting of wage data include such errors as incorrect 
reporting of wage rates, inclusion of workers outside the scope of the 
survey (e.g., office workers), and erroneous division into the auxiliary 
or non-auxiliary class. 

The survey was initially made during July 1951 for a June 1951 payroll 
period, to provide data for aiding in the determination of the 
prevailing minimum wage in that industry.* Responses were obtained 
predominantly by mail questionnaire. A subsample of the non-respondents 
to the mail questionnaire was contacted in person or by telephone so 
that an unbiased estimate of the wage distribution of the industry 
could be prepared. Subsequently, questions were raised as to whether 
mail collection was appropriate, and whether the data presented were 
acceptable for use in minimum wage determination. Accordingly, a 
field resurvey was made to check the survey results and to determine 
whether mail questionnaires are feasible in surveys of such industries 
as rainwear, where piece-work payments prevail. 

The original respondents were given no advance warning of the re- 
survey. Difficulties generally encountered in studies of response error 
because of uncertainty in identifying the reporting unit were not 
present in this survey. It was possible to identify the reporting units 
as being identical in both surveys, something not always possible with 
families or households. 

The variables measured in both the survey and resurvey were sub- 
ject to objective determination from payroll records. Thus variations 
in reporting would be attributed to indifference, misinterpretation of 
instructions, arithmetic errors or purposeful concealment. These items 
cannot be isolated in most studies of survey techniques and reporting 
practices. For example, differences in reporting educational attain- 
ment can be attributed in part to the uncertainty of the definition. A 
very common source of variation in reporting, age, is generally not 
subject to any definite verification. All that is available as a rule are two 
replies by the same respondent, both of which may be in error. 


SURVEY METHODS 


The universe of establishments in the rainwear industry was 
defined to include all establishments whose value of product of rainwear 
was 50 per cent or more of their total production in 1950 or who 
maintain separate rainwear departments. An industry of this type is not 
easily defined or isolated from the remainder of the needle trade 
industries. A comparatively small number of firms are continuously in the 
industry from year to year. For the most part, firms shift in and out 
of this industry with the fluctuations in product demand. The plant 
equipment of firms producing rainwear is not specialized, but rather 
is equally usable for many other needle trade products. 


* The Walsh-Healey Act specifies that employers engaged in government contracts exceeding 
$10,000 must pay the prevailing minimum rate for such work. The Secretary of Labor is assigned the 
responsibility of determining the prevailing rate of pay in question. Administratively, this work is 
carried out by the Wage and Hour and Public Contracts Division of the Department of Labor. The 
Bureau of Labor Statistics is frequently called upon to conduct surveys of wage rates in the industry. 
The data collected in these surveys are used as an aid in this wage determination. 


The initial listing of firms comprising the industry was obtained 
from Unemployment Compensation listings, supplemented by lists of 
firms provided by the International Ladies' Garment Workers' Union 
and the Amalgamated Clothing Workers Union.* The first wave of 
questionnaires was mailed to all listed firms (633). A second wave of 
schedules was sent to all nonrespondents. The 353 responses to these 
two mailings consisted of 84 firms being defined as within scope (firms 
within the rainwear industry) and 269 firms as being out of scope. Of 
the nonrespondent group, a sample of 70 firms was selected for direct 
contact by telephone or personal visit. The non-response sample 
produced 19 firms within the scope of the survey and 51 out of scope. 


                              TABLE 1 
             SAMPLING PROCEDURE OF ORIGINAL SURVEY 


Stages in the Sampling        Number of                  Out of 
Procedure                     Schedules     In Scope     Scope 

Total mailed                     633 
  Responded                      353           84*        269 
  Did not respond                280 
    Direct contact                70           19†         51 

Total received                   423          103         320 


* 66 schedules usable; rejected schedules failed to provide wage 
distributions or other essential data. 
† All schedules usable. 
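
The counts in Table 1 can be verified with a little arithmetic. The sketch below also shows how a subsample of non-respondents is commonly expanded to stand for all non-respondents. The article does not state the weights actually applied, so the expansion factor and the resulting estimate of in-scope firms are assumptions made purely for illustration.

```python
# Counts taken from Table 1 and the accompanying text.
MAILED = 633
RESPONDED_IN_SCOPE, RESPONDED_OUT_OF_SCOPE = 84, 269
CONTACT_IN_SCOPE, CONTACT_OUT_OF_SCOPE = 19, 51


def table_1_summary():
    responded = RESPONDED_IN_SCOPE + RESPONDED_OUT_OF_SCOPE
    nonrespondents = MAILED - responded
    contacted = CONTACT_IN_SCOPE + CONTACT_OUT_OF_SCOPE
    # Assumed weighting: each directly contacted firm stands for
    # nonrespondents / contacted firms in the non-response group.
    weight = nonrespondents / contacted
    estimated_in_scope = RESPONDED_IN_SCOPE + weight * CONTACT_IN_SCOPE
    return {
        "responded": responded,
        "nonrespondents": nonrespondents,
        "total_received": responded + contacted,
        "weight": weight,
        "estimated_in_scope": estimated_in_scope,
    }


summary = table_1_summary()
for name, value in summary.items():
    print(name, value)
```

Under this assumed weighting, the 19 in-scope firms found by direct contact carry four times the weight of a mail respondent.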


The resurvey consisted of two parts: (1) a sample check of firms 
reported initially as out of scope and (2) the collection by personal visit 
of data initially obtained by mail. 

Part (1) of this investigation consisted of a sample of 42 firms in 
New York City classified as out of scope on the basis of mail returns 
in the initial survey. These 42 firms were rechecked by personal visit. 
Of the 17 firms in the sample of 42 initially reported as out of business, it 
was found that 10 were completely out of business, one was 
manufacturing another product, 4 were defunct rainwear subsidiaries of 
firms still in other lines, and 2 were in the rainwear business elsewhere 
and under different names (included in the universe under the new 
name). 


* Past experience has indicated that lists compiled in this manner contain most of the firms in the 
industry. An exhaustive investigation of the completeness of the list would be financially prohibitive, 
as it would involve a detailed investigation of a number of related industries. 


Of the 25 remaining firms in the out of scope sample, only 3 could be 
classified properly in the rainwear industry as defined. 

On the basis of this sample survey, it appears that the Bureau under- 
estimated the number of rainwear establishments in New York City 
by about 15 or approximately 20 per cent. 

                              TABLE 2 

      RESULTS OF RESURVEY OF FIRMS CLASSIFIED AS OUT 
       OF SCOPE IN ORIGINAL SURVEY, NEW YORK CITY 


                                             Number of Firms 
Classification                             Original    Resurvey 

Total sample of firms                         42          42 

Out of business                               17          10 
  Change of product                            -           1 
  Discontinued rainwear dept.                  -           4 
  Operating under different names: 
    Included in universe                       -           2 

Manufacturing other products                  25          22 
  Manufacturing rainwear                       -           3 

Total out of scope                            42          39 


Most of these 42 establishments at some previous time manufactured 
rainwear. However, a general decline in the volume of such production 
had resulted in rainwear no longer being the principal product or in the 
discontinuance of a separate rainwear department. Such shifts had been 
taking place since 1946 and explain in part the divergence of the survey 
data with respect to number of establishments (1951) from that of the 
1947 Census of Manufactures. 

Despite the under-representation of the Middle Atlantic region (as 
suggested by the New York City results) in the Bureau of Labor 
Statistics survey, no great difference in the Nation-wide statistics could 
be generated by such a deficiency. Under the extreme assumption 
that the bias of omission occurs in that area alone, the increase of the 
Middle Atlantic employment raises the Nation-wide median by 1 cent 
and decreases the percentage of workers in the lowest class interval 
(75 cents-80 cents) by no more than three-tenths of 1 per cent. Actually, 
an understatement of 20 per cent in number of firms in New York City 
and in turn in the Middle Atlantic Region means less than a 20 per cent 
understatement in employment since firms in this region are smaller 
than those in other regions. Such field work as was done in New York 
is very expensive to do elsewhere because of the scatter of the 
remaining establishments. A small sample in other areas would be practically 
useless because of the apparently low proportion of bona fide rainwear 
firms erroneously excluded. 

Do errors of the universe, such as we have just described, form an 
effective argument against making mail surveys in such industries as 
rainwear, where the establishment turn-over is so great? Without the 
expenditure of large sums, it is likely that the error in any field survey 
would be even greater. In order to obtain 85 usable schedules, it was 
necessary to make an initial mailing to more than 600 establishments, 
most of which did not fall within the scope of the survey. The field 
experience would not be appreciably better than this, judging from the 
canvass made in New York. Such a canvass by personal visit would be 
considerably more expensive. Hence the alternatives presented are (1) 
a mail survey adjusted for non-response, or (2) no survey at all. The 
extent to which correction should be made depends on the use to be 
made of the survey. 

In the second part of the resurvey, an attempt was made to visit all 
firms not originally interviewed in person. However, visits were not 
made by the Bureau to 13 establishments which were included in a 
field survey made by Dr. Lazare Teper of the International Ladies' 
Garment Workers' Union. Dr. Teper provided the Bureau with data 
collected in his survey. 

Wage data collected in the resurvey were obtained directly from 
payroll records. Schedules were obtained from 53 of the 57 firms visited 
by the Bureau of Labor Statistics. One firm employing about 100 employees 
refused to cooperate in the resurvey. The other three, employing a total 
of 28 workers, were either out of business at the time of the resurvey, 
or their records were unavailable. 

The retabulations are thus based on 15 schedules obtained 
originally by personal visit, 13 obtained by the ILGWU, and 53 obtained 
in the resurvey.* 

All schedules were given the same weight in the retabulation as in 
the original survey. The final tabulations are therefore the equivalent 
of a complete field survey. 


* Four of the original 19 schedules obtained by direct contact, as mentioned in Table 1, were 
obtained by telephone. These were included in the 53 firms resurveyed. 



COMPARISON OF RESULTS 


In comparing the results of the two surveys, it is assumed 
throughout that the data collected by personal interview are without errors. In 
reality, it is never possible to obtain absolutely accurate data whether 
by field or by mail procedures. Errors and biases exist in all data in 
varying degrees. Considering the nature of the survey, it is reasonable 
to assume that the data collected by personal interview are freer from 
biases of reporting and random errors than those obtained by mail. 

Table 3 and Chart 1 show the comparative cumulative distribution 
of workers by earnings, obtained by the two methods of collection. 


                              TABLE 3 

    CUMULATIVE PERCENTAGE DISTRIBUTION OF PRODUCTION 
  WORKERS IN THE RAINWEAR INDUSTRY BY STRAIGHT-TIME 
 AVERAGE HOURLY EARNINGS, UNITED STATES AND SELECTED 
      REGIONS, JUNE 1951, ORIGINAL AND RESURVEY 


Average hourly earnings*       United States†            New England 
(in cents)                   Original    Resurvey    Original    Resurvey 

Under  75.0                      ‡          0.4    (illegible)      0.7 
Under  80.0                     13.4       11.8        17.8        18.5 
Under  85.0                     17.2       16.2        22.4        23.4 
Under  90.0                     23.7       22.6        27.8        28.4 
Under  95.0                     28.6       27.7        32.3        32.9 
Under 100.0                     34.0       34.1        35.4        36.3 
Under 105.0                     42.2   (illegible)     43.5        43.6 
Under 110.0                     48.1       47.0        48.0        48.2 
Under 115.0                     55.0       54.1        54.0        53.1 
Under 120.0                     59.2       58.5        58.0        57.6 
Under 125.0                     63.6       63.3        61.6        61.6 
Under 130.0                     68.7       67.8        66.1        65.6 
Under 135.0                     72.4       71.6        68.9        68.6 
Under 140.0                     74.7       74.1        71.0        70.9 
Under 145.0                     77.6       77.1        74.3        74.0 
Under 150.0                     79.8       79.0        76.5        76.1 
Under 155.0                     82.4       81.4        79.9        79.1 
Under 160.0                     83.8       82.9        82.0        81.5 
Under 165.0                     85.6       84.8        84.3        83.9 
Under 170.0                     86.7       86.0        85.6        85.4 
Under 175.0                     87.8       87.3        87.0        87.0 

Number of workers              9,149      8,929       4,005       3,960 
Median rate                    $1.10      $1.12       $1.12       $1.12 


                        TABLE 3 (Continued) 

Average hourly earnings*       Middle Atlantic           Great Lakes 
(in cents)                   Original    Resurvey    Original    Resurvey 

Under  75.0                      ‡          0.3         0.1         0.1 
Under  80.0                      9.2        4.8        10.8         7.1 
Under  85.0                     12.4        9.0        11.4         9.0 
Under  90.0                     17.0       13.9        21.5        19.5 
Under  95.0                     21.5       19.3        27.2        25.4 
Under 100.0                     24.8       24.0        38.5        38.6 
Under 105.0                     34.5       32.1        45.6        45.1 
Under 110.0                     39.0       37.0        55.1        53.4 
Under 115.0                     43.5       42.2        65.2        65.7 
Under 120.0                     45.4       44.0        72.2        72.6 
Under 125.0                     49.0       48.0        79.0        79.4 
Under 130.0                     55.5       53.1        83.9        84.1 
Under 135.0                     60.7       58.6        87.7        87.8 
Under 140.0                     63.6       61.6        89.8        89.9 
Under 145.0                     66.2       64.8        92.6        92.8 
Under 150.0                     68.8       66.3        94.8        94.6 
Under 155.0                     71.5       68.8        95.7        95.6 
Under 160.0                 (illegible)    69.9        96.3        96.3 
Under 165.0                     73.9       71.5        97.4        97.4 
Under 170.0                     75.6       73.3        97.8        97.7 
Under 175.0                     76.7       74.8        98.4        98.4 

Number of workers              2,276      2,205       2,527       2,432 
Median rate                    $1.27      $1.27       $1.07       $1.08 


* Excludes premium pay for overtime and night work. 

† Includes data for other regions in addition to those shown separately. 

‡ Less than .05 of 1 per cent. 

Beyond the 90-cent point there is little difference in any of the regional 
data, and even below this point there is little difference for the in- 
dustry as a whole. There were slight increases in the median rates, 2 
cents in the industry as a whole, 2 cents in the Middle Atlantic region, 
and 1 cent in the Great Lakes region. 

The lower end of a wage distribution is of critical importance for 
minimum wage determination. Examination of these parts of the 
distribution shows that the greatest difference occurs in the interval 
between 75 cents and 80 cents: 11.4 per cent of the workers were classified 
in this interval in the resurvey as against 13.4 per cent in the original 
[...] regional distributions. In the Middle Atlantic region, the estimated 
percentages of workers between 75 cents and 80 cents were 9.2 per cent 
for the original survey and 4.5 per cent in the resurvey. In the Great 
Lakes region the corresponding percentages were 10.7 and 7.0. These 
differences are due primarily to the two firms reporting large blocks of 
workers at the guaranteed rate rather than at actual earnings, and 
two other questionable schedules with concentrations of workers at 75 
cents in the original survey. (Of these, one firm refused to cooperate in 
the resurvey, while the other firm supplied incorrect data in the 
original survey.) 

[Chart 1. Cumulative percentage distribution of production workers in 
the rainwear industry, by average hourly earnings, June 1951; the 
legible legend entry is "Field Survey." Horizontal axis: average hourly 
earnings in cents. Source: United States Department of Labor.] 

Practically no difference was found in the broad auxiliary and non- 
auxiliary classifications (See Table 4). It is noteworthy that small de- 
clines appeared in the earnings of auxiliary workers, although the 
median earnings of the non-auxiliary workers increased slightly. 


                              TABLE 4 
PERCENTAGE DISTRIBUTION OF AUXILIARY AND NON-AUXILIARY 
PLANT WORKERS (EXCLUDING LEARNERS) IN THE RAINWEAR 
INDUSTRY BY STRAIGHT-TIME AVERAGE HOURLY EARNINGS,* 
FOR THE UNITED STATES, JUNE 1951, 
ORIGINAL AND RESURVEY 


                                  Auxiliary              Non-Auxiliary 
Average hourly earnings*           Workers                  Workers 
(in cents)                   Original    Resurvey    Original    Resurvey 

Under 75.0                       0.1        0.3          -          0.4 
 75.0 and under  80.0           36.3       41.8         9.6         7.7 
 80.0 and under  85.0           10.5       10.7         3.1         3.6 
 85.0 and under  90.0           11.9       14.7         5.8         5.4 
 90.0 and under  95.0            5.8        7.5         4.8         4.8 
 95.0 and under 100.0            4.1        5.1         5.6         6.5 
100.0 and under 105.0            8.9        6.3         8.1         7.4 
105.0 and under 110.0            3.6        2.4         5.1         6.0 
110.0 and under 115.0            3.0        1.4         7.3         7.8 
115.0 and under 120.0            2.7        3.9         4.4         4.5 
120.0 and under 125.0       (illegible)     2.5         4.8         5.0 
125.0 and under 130.0            2.0        1.5         5.5         4.9 
130.0 and under 135.0            1.8         .8         4.2         4.2 
135.0 and under 140.0       (illegible)      .4         2.6         2.7 
140.0 and under 145.0             .7         .3         3.3         3.4 
145.0 and under 150.0            8.2         .8         2.8         2.1 
150.0 and under 155.0            1.0         -          2.8         2.7 
155.0 and under 160.0       (illegible)      .2         1.5         1.7 
160.0 and under 165.0             .5    (illegible)     2.0         2.1 
165.0 and under 170.0       (illegible) (illegible)     1.8         1.4 
170.0 and under 175.0             .2         -          1.2         1.5 
175.0 and over              (illegible)      .6        13.7        14.2 

Total                          100.0      100.0       100.0       100.0 
Number of workers              1,027        990       8,051       7,939 
Median rate                    $0.85      $0.84       $1.14       $1.15 


* Excludes premium pay for overtime and night work. 


REPORTING ERRORS OF FIRMS IN SCOPE 


Establishments revisited gave a variety of explanations as to how the 
original data were compiled. Of the 53 firms studied, 25 followed the 
instructions completely and correctly; a few included salesmen and 
office workers; some estimated earnings; a few included overtime; and 
10 simply could not recall what was done. The most serious downward 
bias resulted from two firms which reported guaranteed rates instead of 
actual earned rates. The main difference between the original survey 
and the resurvey is due to the errors of these two firms and to the 
omission of the one firm with 100 employees that refused to cooperate in 
the resurvey. 

[Chart 2. Distribution of errors in reporting number of workers earning 
75 and under 80 cents per hour, by number of firms, June 1951. 
Horizontal axis: number of production workers misclassified; vertical 
axis: number of firms. Source: United States Department of Labor, 
Bureau of Labor Statistics.] 

A comparison was made of changes in reporting in the 75 and under 80 
cent groups (omitting the two firms which misinterpreted the 
instructions). 

A frequency distribution of the errors was constructed (Chart 2). 
The X variable was the number of production workers misclassified in 
the 75 and under 80 cent class. Positive errors were workers 
erroneously included in the class, and negative errors were those who should 
have been included. The Y variable was the number of firms which 
misclassified workers. 

This frequency distribution of reporting errors showed an arithmetic 
mean of -0.4. Thus, the critical class was less than half a worker per 
firm too large in the original survey. In order to see whether the error 
in this class is of a compensating nature, the degree of asymmetry was 
established using Pearson's measure of skewness, which equaled 0.05. 
Thus, the error distribution was almost symmetrical and the errors are 
to a great extent compensating. 
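
The two summary measures used here, the arithmetic mean and a Pearson measure of skewness, can be computed from a frequency distribution of errors as sketched below. The error counts are hypothetical, since the article gives the distribution only as a chart. The article also does not say which of Pearson's measures was used, so the sketch assumes the median form, 3(mean - median)/standard deviation.

```python
import math


def summarize_errors(freq):
    """Summarize a frequency distribution of reporting errors.

    freq maps an error (workers misclassified by a firm, signed) to the
    number of firms showing that error. Returns the mean, the standard
    deviation, the median, and Pearson's median skewness (assumed form).
    """
    n = sum(freq.values())
    mean = sum(err * count for err, count in freq.items()) / n
    variance = sum(count * (err - mean) ** 2 for err, count in freq.items()) / n
    sd = math.sqrt(variance)

    # Expand to a sorted list of one error per firm to locate the median.
    values = sorted(err for err, count in freq.items() for _ in range(count))
    mid = n // 2
    median = values[mid] if n % 2 else (values[mid - 1] + values[mid]) / 2

    skewness = 3 * (mean - median) / sd
    return mean, sd, median, skewness


# Hypothetical, perfectly symmetrical error distribution for 48 firms.
hypothetical_errors = {-3: 2, -2: 4, -1: 8, 0: 20, 1: 8, 2: 4, 3: 2}

mean, sd, median, skewness = summarize_errors(hypothetical_errors)
print(mean, round(sd, 4), median, skewness)
```

A symmetrical distribution gives a skewness of zero, so a value as small as 0.05 indicates that over-reports and under-reports very nearly offset each other.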

A regional pattern in errors of reporting is noticeable. The closest 
correspondence between the two forms of collection is found in the 
New England region. For the bulk of the firms agreement was 
practically perfect at nearly every point in the distribution. No mail 
questionnaire was of such quality as would necessitate rejection. The 
poorest mail reporting came from the Middle Atlantic region where 5 of the 
22 schedules collected by mail differed so much from the field resurvey 
that they could not be termed usable. Of the 10 comparisons in the rest 
of the country only 1 mail questionnaire was unsatisfactory. Only one 
of the 13 schedules obtained by the ILGWU resurvey was sufficiently 
different from the original mail schedule to warrant rejection of the 
latter. 


SUMMARY AND CONCLUSIONS 


1. The data indicate that collection of wage distributions by mail 
questionnaire is feasible in piece rate industries and is also feasible in 
industries paying hourly rates. 

2. No consistent reporting bias was evidenced. Most of the reporting 
errors were not only compensatory, but were generally small. 

3. The major reporting error in such surveys is the confusing of 
earned and guaranteed rates. Care must be taken in design of the 
questionnaire and in the accompanying instructions to distinguish 
between those two items. Careful editing will frequently disclose whether 
large concentrations of workers at any one rate are caused by 
reporting guaranteed rates instead of earnings. Personal visit may be 
necessary to clarify such cases. 

4. A field follow-up of respondents classifying themselves as out of 
scope is necessary in an industry characterized by frequent changes in 
product. Without such a field follow-up an accurate estimate of the 
universe is difficult, and may result in overrepresentation of some 
segments of the industry in the survey totals. 

INDUSTRIAL CLASSES IN THE UNITED STATES 
1870 TO 1950 


TILLMAN M. SOGGE 
St. Olaf College 


In an earlier article the writer classified the gainful workers in the 
United States for 1870 to 1930, and the labor force in 1940, into 
industrial classes.¹ This paper brings this classification up to date by 
adding the data for 1950. 

The fact that the occupational classifications used in the Census of 
1950 and the Census of 1940 were quite comparable made the task of 
adding the data for 1950 to the previous compilations comparatively 
easy.² 

Some interesting developments have taken place during the decade 
of the 1940s, a period during which our economy was affected very 
markedly by war and postwar changes. In the main these changes have 
not altered the direction of trends indicated in earlier decades but the 
degree of change has in some instances been increased. Some of the 
more important changes from 1940 to 1950 are: 

1. The number of persons for whom occupations were reported 
increased from 50,737,284 in 1940 to 57,632,879, or an increase of 
13.5 per cent, which is a considerably greater rate of increase than 
during the depression decade of the thirties. 

2. The number of persons engaged in farming occupations continued 
to decline in spite of the large increase in the total labor force. 
Table I indicates that the number of farm laborers has declined 
from 6,143,998 in 1910 to 2,497,437 in 1950, or a decline of over 59 
per cent in 40 years. The decline in relative importance of 
agricultural occupations is clearly evidenced by a decrease from 17.4 
per cent of the total in 1940 to 11.8 per cent of the total in 1950. 

3. Both proprietors and officials and professional classes have shown 
a significant increase in numbers during the past decade, but the 
greatest increase in numbers is indicated in the lower salaried 
group and in industrial wage earners. The increase of nearly five 
million industrial wage earners during the past decade gives 
evidence of the expansion in our national economy which has taken 
place. Even though there has been this large increase in the 
number of industrial wage earners during the past decade, and an 
increase in relative importance of this group from 39.3 to 42.9 
per cent of the total, it is significant to note that the importance 
of the industrial wage earners in relation to all the 
non-agricultural classes combined was less in 1950 than in 1920. 

4. The number of servants in 1950 was less than half the number in 
1940, dropping from 2,840,021 to 1,414,732. Percentagewise, 
servants constituted 5.6 per cent of the total in 1940 and only 2.5 
per cent of the total in 1950. 


[Table I. Industrial Classes, 1870 to 1950. Table II. Industrial Classes 
(per cent), 1870 to 1950. Table bodies illegible.] 


1 Tillman M. Sogge, "Industrial Classes in the United States in 1940," Journal of the American 
Statistical Association, 39 (1944), 516-18. Parallel data for the years 1870 through 1920 were first 
presented by Professor Alvin H. Hansen in the two following articles: (1) "Industrial Class Alignments in 
the United States," Journal of the American Statistical Association, 17 (1920), 417-25, and (2) "Industrial 
Classes in the United States in 1920," Journal of the American Statistical Association, 18 (1922), 508-00. 
The data for 1930 were first presented in an article by the writer entitled "Industrial Classes in the 
United States in 1930," Journal of the American Statistical Association, 28 (1933), 199-203. 

2 Coverage of the different classes has been described in the earlier articles in this series listed in 
the footnote above. 




STATISTICAL METHODS FOR POISSON PROCESSES 
AND EXPONENTIAL POPULATIONS 


ALLAN BIRNBAUM* 
Columbia University 


1. INTRODUCTION 


The research worker and statistician frequently deal with phenomena 
in which events of some type occur randomly in time, or in 
which particles are randomly distributed in space. The Poisson 
process is the formal model of such phenomena. Furthermore, many 
phenomena which may naturally be represented by use of the exponential 
distribution or the Poisson distribution can alternatively be 
represented as Poisson processes and dealt with advantageously in this form. 
Statistical methods for the study of such phenomena can be as flexible 
and yet simple as the Poisson process model itself. A number of such 
statistical methods are described and illustrated below. Some of these 
methods were developed recently, while others are well-known but are 
described here briefly for comparison and completeness. 


2. THE POISSON PROCESS 


The results of any experiment in which observation is performed 
continuously and "events" (i.e., occurrences of any specified kind) are 
tallied, can always be described by a function x = x(t), which gives the 
number of events observed, x, during the first t units of observation, for 
all values of t from 0 through T, the total amount of observation 
performed. Such an experiment, yielding an observed function x(t), is a 
Poisson process if the events occur randomly in the sense of the 
following natural definition: given that any number x of events are observed 
in any amount t of observation, the points of occurrence of the x events 
are randomly (i.e., independently uniformly) distributed between 0 
and t. Examples in which the Poisson process is a very accurate and 
useful model are the following: 

Example 1. When a Geiger counter is used in an essentially constant
environment, the counter tallies the number of times it is hit by
radioactive particles. Here the amount of observation performed is
measured in units of time during which the counter is operated. The
Poisson process represents this phenomenon quite accurately. (The
assumptions for the Poisson process will be violated seriously under
extreme conditions: In an environment of very high radioactivity, the


* Research sponsored by the Office of Naval Research.


counter’s inability to record a hit extremely soon after an earlier hit 
will seriously violate the independence assumption. If the environment 
contains radioactive materials with extremely rapid rates of decay, the 
assumptions of equal probabilities will be seriously violated.) 

Example 2. If any homogeneous or well mixed material containing 
practically infinitesimal foreign particles or flaws is inspected, the 
amount of observation will be measured by the volume (or area or 
length) of material inspected, and the “events” will be the particles or 
flaws observed. One case is the use of a haemocytometer to count the
number of blood cells distributed over an area on a slide. (For studies
of the magnitude of deviations from the independence assumption due
to “crowding” of cells, see [1] and the references given there.) Another
case is that of inspection of manufactured materials by length or area
for defects, if the manufacturing process is such that there is no
appreciable dependence in the locations of defects and the average defect
rate does not change appreciably during the period of inspection. Other
applications, and some statistical methods discussed below, will be 
found in [18]. 

A Poisson process can be characterized in the following two simple 
alternative and equivalent ways: 

(a) The “waiting times” u between successive events are
independently distributed with the exponential density function

$$g(u) = (1/\theta)\, e^{-u/\theta} \quad \text{for } u \ge 0.$$

Here θ is the mean of waiting times:

$$E(u) = \theta.$$

(b) The increment y = x(t) − x(t₀) of x(t) on any interval of length
d = t − t₀ has the Poisson distribution

$$p(y) = e^{-d\lambda}\, \frac{(d\lambda)^{y}}{y!}, \quad y = 0, 1, \ldots,$$

and the increments of x(t) on non-overlapping intervals are
independent. Here dλ is the mean increment on an interval of length d. Thus λ
is the mean rate of occurrences, and λ = 1/θ.
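
Characterization (a) can be illustrated by a minimal Python sketch that builds a process by accumulating independent exponential waiting times with mean 1/λ. The function name `simulate_event_times` and the numerical values are hypothetical, chosen only for illustration; the check at the end compares the average count with the mean increment stated in (b).

```python
import random

def simulate_event_times(rate, total_time, rng):
    """Return increasing event times on [0, total_time) for a Poisson process.

    Built from characterization (a): successive waiting times are
    independent exponential variables with mean 1/rate.
    """
    times = []
    t = rng.expovariate(rate)          # first waiting time
    while t < total_time:
        times.append(t)
        t += rng.expovariate(rate)     # add the next waiting time
    return times

rng = random.Random(1954)              # fixed seed for reproducibility
rate, total_time = 2.0, 10.0           # hypothetical lambda and T
counts = [len(simulate_event_times(rate, total_time, rng)) for _ in range(2000)]
mean_count = sum(counts) / len(counts)
# By characterization (b) the count on [0, T) should average about rate * T.
print(round(mean_count, 2), "compared with", rate * total_time)
```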

All statistical questions concerning Poisson processes involve
inferences about the value of the single parameter λ or θ of one process, or
about the values of the respective parameters of several processes.
(Methods for determining whether a given process is Poisson will not
be discussed here; Davis [6] describes and applies several methods.)

The flexibility and simplicity of statistical methods applicable to



AMERICAN STATISTICAL ASSOCIATION JOURNAL, JUNE 1954


the life-testing procedures developed by Epstein and Sobel [9, 10]. It
is assumed here that the lengths of life of a type of electron tube, say,
are exponentially distributed with unknown mean life θ. (For some
empirical tests of this assumption, see [6].) If one tube is observed until
failure, then replaced by a new tube which is again replaced upon its
failure, and so on, we have a sequence of observed lengths of life u from
an exponential distribution. Hence the number of failures observed by
any time t is a Poisson process x(t). Now suppose that we place any
number of such tubes under observation, and add new tubes or remove
tubes from observation in a quite arbitrary manner. The information
given by such an experiment is fully described as a single Poisson
process, where t is now taken to be the number of tube-hours of life
observed at any point in the procedure, and x(t) is the number of
failures observed up to that point. The feature illustrated here is called
the additivity property of the Poisson process.

The second source of flexibility of statistical methods is the fact that 


tributions which characterize the Poisson process, the exponential and 
the Poisson distributions.


3. ESTIMATION OF λ OR θ


$$\mu = \lambda t,$$

$$\Pr(x) = e^{-\mu}\, \frac{\mu^{x}}{x!} \quad \text{for } x = 0, 1, 2, \ldots$$


By standard methods we can construct a confidence interval for μ, at
a desired confidence level. An exact construction can be based on
tables of the Poisson distribution, and extensive tables of such confidence
intervals as well as tables of the Poisson distribution are available in
[13, 17, 20].
Method 2. If observation of a Poisson process is continued until a
specified number m of events has been observed, and if T is the amount
of observation actually required to observe m events, then 2λT has the
chi-square distribution with 2m degrees of freedom. A confidence
interval for λ at a desired confidence level can then be constructed in the
standard way, as described below.

It will be useful here and also in connection with other statistical
problems discussed in the following sections to note that statistical
problems dealing with variances of normal populations have direct
analogues in problems dealing with parameters of Poisson processes.
The analogue of the present problem is that of constructing a
confidence interval for the variance σ² of a normal population given S²,
the sum of squared deviations from the known or estimated mean, with
2m degrees of freedom. The confidence interval is constructed by using
the fact that S²/σ² has the chi-square distribution with 2m degrees of
freedom. Thus 2T/θ and S²/σ² have the same distributions, and so do
the statistics 2T and S² if the corresponding parameters θ and σ² are
equal. Here and in later problems we can take advantage of this
correspondence to apply statistical methods developed originally
primarily for a problem dealing with normal variances to a corresponding
problem dealing with Poisson processes, and vice-versa.

To construct the desired confidence interval for λ, we select from
tables of the chi-square distribution with 2m degrees of freedom, values
C and D such that the probabilities of values less than C, and greater
than D, are each α/2. Then we can write

$$1 - \alpha = \text{Prob}\{C \le 2\lambda T \le D\} = \text{Prob}\left\{\frac{C}{2T} \le \lambda \le \frac{D}{2T}\right\}.$$

Thus (C/2T, D/2T) is a confidence interval for λ, with confidence
coefficient 1 − α. The small bias of such confidence intervals could be
removed at the price of complicating the procedure.
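
The construction of Method 2 can be sketched in Python without tables, assuming only the closed form of the chi-square distribution function for an even number 2m of degrees of freedom. The function names and the sample values T = 5, m = 10 are hypothetical; the quantiles C and D are found by bisection rather than read from a table.

```python
import math

def chi2_cdf_even(x, m):
    """Distribution function at x of chi-square with 2m degrees of freedom.

    For even degrees of freedom the closed form is
    1 - exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!.
    """
    if x <= 0:
        return 0.0
    half = x / 2.0
    term = math.exp(-half)             # k = 0 term
    total = term
    for k in range(1, m):
        term *= half / k               # (x/2)^k / k! built up step by step
        total += term
    return 1.0 - total

def chi2_quantile_even(p, m):
    """Value q with chi2_cdf_even(q, m) = p, found by bisection."""
    lo, hi = 0.0, 2.0 * m + 10.0
    while chi2_cdf_even(hi, m) < p:    # widen the bracket if needed
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf_even(mid, m) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def rate_confidence_interval(T, m, alpha):
    """Interval (C/2T, D/2T) for lambda after waiting T for m events."""
    c = chi2_quantile_even(alpha / 2.0, m)
    d = chi2_quantile_even(1.0 - alpha / 2.0, m)
    return c / (2.0 * T), d / (2.0 * T)

low, high = rate_confidence_interval(T=5.0, m=10, alpha=0.05)
print(round(low, 3), round(high, 3))
```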
The following method of constructing point estimates of λ based on
Method 2 was developed in [15]. We have with probability 1 − α that

$$C/2T \le \lambda \le D/2T.$$

Given this inequality, the maximum percentage deviation of λ from an
estimate λ* is minimized by taking

$$\lambda^{*} = \frac{C + D}{4T}.$$

This maximum percentage deviation is

$$\tau = \frac{D - C}{D + C} \cdot 100\%.$$




τ is a quantity which decreases as the number m of events observed
increases. Hence we can use tables of the chi-square distribution to
determine the number m of events to be observed to give an estimate λ*
with maximum percentage deviation not exceeding any given positive
τ, at confidence level 1 − α.
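
The choice of m can be sketched as a simple search, assuming the maximum percentage deviation takes the form 100(D − C)/(D + C) with C and D the lower and upper α/2 points of chi-square with 2m degrees of freedom. The function names and the 50% target are hypothetical; the chi-square quantiles are computed from the closed form for even degrees of freedom instead of being read from tables.

```python
import math

def chi2_cdf_even(x, m):
    """Chi-square distribution function with 2m degrees of freedom (closed form)."""
    if x <= 0:
        return 0.0
    half = x / 2.0
    term = math.exp(-half)
    total = term
    for k in range(1, m):
        term *= half / k
        total += term
    return 1.0 - total

def chi2_quantile_even(p, m):
    """Invert chi2_cdf_even by bisection."""
    lo, hi = 0.0, 2.0 * m + 10.0
    while chi2_cdf_even(hi, m) < p:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf_even(mid, m) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def max_percent_deviation(m, alpha):
    """100 (D - C) / (D + C) for 2m degrees of freedom."""
    c = chi2_quantile_even(alpha / 2.0, m)
    d = chi2_quantile_even(1.0 - alpha / 2.0, m)
    return 100.0 * (d - c) / (d + c)

def events_needed(target_percent, alpha, m_max=100000):
    """Smallest m whose maximum percentage deviation is within the target."""
    for m in range(1, m_max + 1):
        if max_percent_deviation(m, alpha) <= target_percent:
            return m
    raise ValueError("target not reached within m_max events")

m_required = events_needed(50.0, 0.05)   # hypothetical target of 50 per cent
print(m_required, round(max_percent_deviation(m_required, 0.05), 1))
```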

On the other hand, when Method 1 is used, there is no way of
prescribing the amount t of observation so as to obtain point estimates of
bounded percentage error. As shown in most statistics texts, an
approximate construction of the confidence intervals of Method 1 is
given, say at the .95 confidence level, by the interval

$$\frac{x}{t} - \frac{1.96\sqrt{x}}{t} \le \lambda \le \frac{x}{t} + \frac{1.96\sqrt{x}}{t}.$$

It is seen that here the maximum percentage deviation of λ from the
estimate x/t is (1.96/√x)·100%, a function of the observed value of x.
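
As a small worked illustration of this approximate interval, the following Python sketch computes the limits and the resulting maximum percentage deviation. The function names and the sample values x = 100, t = 10 are hypothetical.

```python
import math

def approx_rate_interval(x, t, z=1.96):
    """Approximate limits x/t -/+ z*sqrt(x)/t for lambda (z = 1.96 for .95)."""
    estimate = x / t
    half_width = z * math.sqrt(x) / t
    return estimate - half_width, estimate + half_width

def percent_deviation(x, z=1.96):
    """Maximum percentage deviation (z / sqrt(x)) * 100; depends only on x."""
    return 100.0 * z / math.sqrt(x)

low, high = approx_rate_interval(x=100, t=10.0)
print(low, high, percent_deviation(100))
```

The percentage deviation depends on the observed x alone, which is why it cannot be fixed in advance by the choice of t.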

Method 3. In Method 1, the absolute deviation of the estimate x/t
from λ is bounded by C_α(√x/t), where C_α is a constant
corresponding to the confidence level used, with probability 1 − α. In Method 2
the corresponding bound is λ*τ/100%. In neither case can the
magnitude of this bound be prescribed in planning the amount of
observation t or m, since each bound is determined by the outcome of the
experiment. The following method gives estimates of prescribed absolute
precision.

The problem here is to give an estimate λ* of λ such that with
probability at least 1 − α, |λ* − λ| ≤ ε, where α and ε are given positive
constants. Let n be a positive integer. Observe T_n, the waiting time
required for the occurrence of n events. Let c = αε²/2n. Perform
additional observation of the process for 1/(2cT_n) units of time; let X be the
number of events observed in this period. Let λ* = 2cT_nX. Then λ* is
an estimate meeting the stated requirements.

The stated properties of λ* may be demonstrated by verifying that
λ* has expected value λ and variance 2nc. Then Tchebycheff's
inequality gives

$$\text{Prob}\{|\lambda^{*} - \lambda| \le \epsilon\} \ge 1 - \frac{2nc}{\epsilon^{2}} = 1 - \alpha.$$

It remains for further investigation to determine the largest constant
c′ which can be used to determine the additional amount 1/(2c′T_n) of
observation without destroying the desired properties of the estimates
2c′T_nX. Likewise, rules for the optimal choice of n remain to be
investigated.
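
The two-stage procedure can be sketched as a simulation, assuming c = αε²/2n, an additional observation period of length 1/(2cT_n), and the estimate 2cT_nX. The function names, the seed, and the values of the true rate, n, α and ε are hypothetical; the true rate is used only to generate the simulated process.

```python
import random

def count_events(rate, duration, rng):
    """Number of events of a Poisson process with the given rate in [0, duration)."""
    n, t = 0, rng.expovariate(rate)
    while t < duration:
        n += 1
        t += rng.expovariate(rate)
    return n

def two_stage_estimate(true_rate, n, alpha, eps, rng):
    """One simulated run of the two-stage estimate of lambda."""
    c = alpha * eps ** 2 / (2.0 * n)
    t_n = sum(rng.expovariate(true_rate) for _ in range(n))  # waiting time for n events
    extra = 1.0 / (2.0 * c * t_n)                            # additional observation
    x = count_events(true_rate, extra, rng)                  # events in that period
    return 2.0 * c * t_n * x

rng = random.Random(7)
true_rate, n, alpha, eps = 2.0, 5, 0.10, 0.5                 # hypothetical values
estimates = [two_stage_estimate(true_rate, n, alpha, eps, rng) for _ in range(500)]
coverage = sum(abs(e - true_rate) <= eps for e in estimates) / len(estimates)
mean_estimate = sum(estimates) / len(estimates)
print(round(mean_estimate, 3), coverage)
```

Because the guarantee rests on Tchebycheff's inequality, the observed coverage in such a simulation will typically be well above 1 − α, which is consistent with the remark that a shorter additional period may suffice.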




4. TESTS OF HYPOTHESES ON λ OR θ

A test of a hypothesis specifying the value of λ (and hence of θ = 1/λ),
say the hypothesis H₀: λ = λ₀, is provided by each method of
constructing a confidence interval for λ. If A and B are confidence limits for λ
constructed by one of the methods of the preceding sections at
confidence level 1 − α, then a test of H₀ at the 1 − α significance level is
obtained by accepting H₀ if A ≤ λ₀ ≤ B and otherwise rejecting H₀. Tests
of H₀ against one-sided alternatives (e.g., λ > λ₀) can be based on
corresponding one-sided confidence limits for λ. Tests with prescribed power
can be designed by use of the Poisson or chi-square distributions. As
these procedures are relatively standard, they will not be discussed
further here.


In each of these testing methods, an economy in the amount of
observation required can be achieved by taking advantage of the fact
that observation of the Poisson process can be performed
continuously. For example, consider tests of H₀: λ ≤ λ₀ against the one-sided
alternative λ > λ₀. In the type of test based on a prescribed amount t
of observation, the test procedure will amount to rejecting H₀ if the
observed number of events x exceeds some “critical number” x_c. Hence
if x_c + 1 events have been observed at any point in the procedure,
observation can be terminated and H₀ rejected at once. In the type of
test based on observation of a prescribed number m of events, the test
procedure will amount to accepting H₀ if the required amount T of
observation is at least equal to some “critical amount” T_c. Hence if m
events are not observed before t = T_c, observation can then be
terminated and H₀ accepted at once; while if m events are observed at or
before t = T_c, observation can then be terminated and H₀ rejected at
once.

Another type of test procedure (which has been investigated by two
groups of statisticians, in [8] and [9]) achieves optimal economy in the
required amount of observation at the price of a slightly more
complicated statistical procedure. Such tests are now available only for testing
H₀ against one-sided alternatives. (Two-sided tests of this type can in
principle be constructed, by the method used by Wald in [23], pp. 134-
37, but this has not yet been done because of certain technical
difficulties involved.) These tests can be described as typical sequential
probability ratio tests of the kind developed by Wald in [23] and [24], with
account and advantage being taken of the fact that the Poisson process
may be observed continuously. The typical form of these procedures
is the following: Continue observation only so long as the number x of
events observed and the amount t of observation performed satisfy the
inequality

$$b + st < x < a + st.$$

As soon as x ≤ b + st, accept H₀; as soon as x ≥ a + st, accept H₁: λ = λ₀ + Δ.
Application of the method may be simplified by a graphical
representation of the test inequalities in the (t, x) plane. Here

$$s = \frac{\Delta}{\log\left(1 + \Delta/\lambda_0\right)}, \qquad a = \frac{\log\left[(1 - \beta)/\alpha\right]}{\log\left(1 + \Delta/\lambda_0\right)},$$

and

$$b = \frac{\log\left[\beta/(1 - \alpha)\right]}{\log\left(1 + \Delta/\lambda_0\right)},$$

where α is the desired maximum probability of accepting H₁ when
λ ≤ λ₀, and β is the desired maximum probability of accepting H₀ when
λ ≥ λ₀ + Δ.
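
A continuously observed sequential test of this kind can be sketched as a simulation. The sketch assumes the usual Wald boundary lines for a Poisson process, with slope s = Δ/log(1 + Δ/λ₀) and intercepts a = log[(1 − β)/α]/log(1 + Δ/λ₀), b = log[β/(1 − α)]/log(1 + Δ/λ₀), and requires α + β < 1. The function names, seed and parameter values are hypothetical; the true rate is used only to generate events.

```python
import math
import random

def sprt_constants(lam0, delta, alpha, beta):
    """Slope s and intercepts a, b of the two boundary lines (assumed Wald form)."""
    g = math.log(1.0 + delta / lam0)
    s = delta / g
    a = math.log((1.0 - beta) / alpha) / g
    b = math.log(beta / (1.0 - alpha)) / g
    return s, a, b

def run_sprt(true_rate, lam0, delta, alpha, beta, rng):
    """Simulate one test; returns (decision, amount of observation t, events x)."""
    s, a, b = sprt_constants(lam0, delta, alpha, beta)
    x, t = 0, 0.0
    while True:
        t_next = t + rng.expovariate(true_rate)   # time of the next event
        t_lower = (x - b) / s                     # time at which b + s*t rises to x
        if t_lower <= t_next:
            # The lower line is reached before another event occurs.
            return "accept H0", t_lower, x
        t, x = t_next, x + 1
        if x >= a + s * t:
            # The upper line can only be crossed at the instant of an event.
            return "accept H1", t, x

rng = random.Random(11)
runs = 400
wrong_h1 = sum(run_sprt(1.0, 1.0, 1.0, 0.05, 0.05, rng)[0] == "accept H1"
               for _ in range(runs)) / runs
wrong_h0 = sum(run_sprt(2.0, 1.0, 1.0, 0.05, 0.05, rng)[0] == "accept H0"
               for _ in range(runs)) / runs
print(wrong_h1, wrong_h0)
```

Between events only the lower line can be reached, so the sketch checks it in continuous time and checks the upper line only when an event occurs; this is where continuous observation saves effort relative to inspecting the process at fixed intervals.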

In [10] a number of life testing procedures using Method 2 were
investigated, and their operating characteristics and required amounts of
test equipment and time determined.


5. COMPARING TWO POISSON PROCESSES 


Comparisons of two Poisson processes with parameters λ₁, λ₂
respectively are expressed most usefully in many applications by
statements about the value of the ratio γ = λ₂/λ₁. The methods described
in the following Section 5A provide estimates and tests on γ. In other
applications, it may be more appropriate to express comparisons of
two processes by statements about the value of the difference
Δ = λ₂ − λ₁; the rather restricted methods available for comparisons in terms
of Δ are described in Section 5B. The following examples illustrate
possible reasons for choosing either γ or Δ as a criterion of comparison.

Example 1. In order to measure the effectiveness of a shield intended
for protection against radiation, we may take any steady source of
radiation and a Geiger counter in fixed positions, and record “hits” by
radiation alternately with the shield interposed between counter and
source and with the shield removed. The natural measure of
effectiveness of such a shield is the percentage of radiation it eliminates. If λ₂
is the intensity of radiation at the counter when the shield is present,
and λ₁ is the intensity when the shield is removed, then the effectiveness
of the shield is 100[1 − (λ₂/λ₁)]%. A confidence interval estimate of
γ = λ₂/λ₁ provides statistical information in an appropriate form, such
as, for example, “At confidence level .99, we estimate the effectiveness
of the shield to be at least 98.4%.”




Example 2. In order to decide which of two samples of ore shows
greater radioactivity, as a basis for selecting one of two lots of ore for
refining, we can record radiation from each sample with a Geiger
counter. If λ₁, λ₂ are the respective intensities of radiation at the
counter, the degree to which one lot is preferable to the other is
represented by Δ = λ₂ − λ₁, for the gain to be achieved by a correct choice of
a better lot is proportional to |Δ|. An appropriate statistical procedure
is a test of the hypothesis H₁ that λ₁ > λ₂, against the alternative λ₁ < λ₂,
satisfying the requirement that if |Δ| ≥ Δ′, where Δ′ is the smallest
value of |Δ| of practical significance, then with probability at least
1 − α the better lot will be chosen.


5A. COMPARISONS IN TERMS OF γ = λ₂/λ₁


Method 1. If a prescribed amount t of observation is performed on
each of two Poisson processes with parameters λ₁, λ₂, then the numbers
x₁, x₂ of events observed in the respective processes will have Poisson
distributions with means μ₁ = λ₁t, μ₂ = λ₂t. Then μ₂/μ₁ = λ₂/λ₁ = γ. It has
been shown in [19] and [22] that all information on the value of γ
contained in the observations x₁, x₂ is obtained by treating the
observations as a binomial sample of m = x₁ + x₂ observations in which x₁
“successes” are observed, and the probability of a “success” is

$$p = \frac{\lambda_1}{\lambda_1 + \lambda_2} = \frac{1}{1 + \lambda_2/\lambda_1} = \frac{1}{1 + \gamma}.$$

Thus to test a hypothesis on the value of γ, say H₀: γ = 1 (i.e., λ₁ = λ₂),
we consider the corresponding test of H₀′:

$$p = \frac{1}{1 + \gamma} = \frac{1}{1 + 1} = \frac{1}{2},$$

when x₁ successes are observed in m binomial trials. We reject H₀ if
and only if H₀′ is rejected, and at the same significance level. Again, to
construct a confidence interval for γ, we can first construct one for p
(as described in [4]) from our sample of x₁ successes in m trials. If we
obtain, for example, “at the .95 confidence level, we estimate that
.33 ≤ p ≤ .50,” then we can use the equation p = 1/(1 + γ), or γ = (1/p) − 1,
to obtain the equivalent “at the .95 confidence level, we estimate
that 1 ≤ γ ≤ 2.”
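
The reduction to a binomial problem can be sketched in Python, assuming p = 1/(1 + γ) is the probability that an event belongs to the first process. The function names are hypothetical, and the two-sided p-value below doubles the smaller tail, which is one common convention and not necessarily the one used in [19] or [22].

```python
import math

def gamma_interval_from_p(p_low, p_high):
    """Map a confidence interval for p = 1/(1 + gamma) to one for gamma.

    The map gamma = 1/p - 1 is decreasing, so the limits change places.
    """
    return 1.0 / p_high - 1.0, 1.0 / p_low - 1.0

def equal_rates_p_value(x1, x2):
    """Exact binomial p-value for p = 1/2 (that is, gamma = 1).

    Assumption: two-sided value taken as twice the smaller tail, capped at 1.
    """
    m = x1 + x2
    k = min(x1, x2)
    tail = sum(math.comb(m, j) for j in range(k + 1)) / 2.0 ** m
    return min(1.0, 2.0 * tail)

print(gamma_interval_from_p(1.0 / 3.0, 0.5))   # limits for gamma
print(equal_rates_p_value(0, 10))              # hypothetical counts x1 = 0, x2 = 10
```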
Method 2. An important weakness of Method 1 is that the procedure
cannot be planned to provide any minimum amount of information on
γ if the values of λ₁, λ₂ are completely unknown. As we have seen, the
method provides information equivalent to a binomial sample of
m = x₁ + x₂ observations, but m itself is a random observation and may
in any particular instance be too small to provide sufficient information
(it may even equal zero, and provide no information about γ). Actually
m has a Poisson distribution with mean μ₁ + μ₂ so that, if the unknown
values of both μ₁ and μ₂ are very small, Method 1 will very often give
very little information about γ; even if μ₁ + μ₂ is quite large, there will
occur some samples with m too small to give sufficient information
about γ.

One way out of this difficulty is apparent. If at least m’ binomial ob- 
servations are required to provide enough information about p (and 
hence about y) in some application, and if an application of Method 1 
gives m<m’, another application of Method 1 with an additional 
amount t of observation of each process may be possible, and if neces- 
sary additional amounts of observation until the total number of 
events observed reaches or exceeds m′. Then the usual binomial
statistical methods can be applied to the totals of events observed in each
process, as in Method 1. 

Method 3. If two Poisson processes can be observed continuously,
with observation on each performed at the same rate, then the
probability that the ith event observed will occur in the first process is
p = 1/(1 + γ), for each i = 1, 2, ···. Such observation thus provides a
sequence of binomial observations of indefinite length, to which the
statistician may apply any of the sequential or curtailed-sampling
methods for binomial data (e.g. Wald [23]), or the statistician may
terminate observation when a prescribed number n of events has occurred,
and apply non-sequential methods for binomial samples of size n.
(Methods 2 and 3 were developed in [2].)

Method 4. An alternative way of prescribing observation of two
Poisson processes is to require that the first be observed until n₁
events have occurred, and the second until n₂ events have occurred
in it. If T₁ is the amount of observation of the first process required to
observe n₁ events, then as we have seen above 2λ₁T₁ has the chi-square
distribution with 2n₁ degrees of freedom; the corresponding quantity
for the second process, 2λ₂T₂, has the chi-square distribution with 2n₂
degrees of freedom.

The present problem is statistically equivalent to that of comparing
variances of two normal populations, given sample variances with 2n₁
and 2n₂ degrees of freedom respectively. To test whether λ₁ = λ₂, we
compute F = n₂T₁/n₁T₂, and use the fact that F has the F-distribution
with 2n₁, 2n₂ degrees of freedom when λ₁ = λ₂. To give a confidence
interval for γ = λ₂/λ₁, we may use the standard method developed for the
corresponding problem of giving a confidence interval for the ratio
σ₁²/σ₂² of two normal variances by use of the F-distribution. These
methods were developed in [11] and [21]. (Cox [5] has investigated the
effects of applying to a Method 1 experiment the simpler analysis of
Method 4.)

In some applications the present procedure would be more easily
applied than those of Methods 2 and 3 above; for example if the
processes are separated in space or time the present procedure might be
applied. If simultaneous observation of the processes is possible,
Methods 2 and 3 would generally give earlier termination since in Method 4
if λ₁ is much smaller than λ₂, T₁ will generally be much larger than T₂.


5B. COMPARISONS IN TERMS OF Δ = λ₂ − λ₁


Method 1. Let two Poisson processes with parameters λ₁, λ₂ be
observed continuously so that the respective waiting times for the first
event in each process, T₁ and U₁ say, are obtained, then the waiting
times T₂, U₂ for the second event in each process, and so on. We thus
obtain a sequence of pairs (T_i, U_i) of waiting times, for each i = 1,
2, ···; the T_i's have an exponential distribution with mean
θ₁ = 1/λ₁, the U_i's an exponential distribution with mean θ₂ = 1/λ₂. Using
such observations we can solve the testing problem stated in Example
2 above.

To do this we apply Girshick’s [14] method to the problem of ranking
the two exponential populations with respect to their means; the
resulting test procedure is as follows: After each pair (T_i, U_i) of observed
waiting times is obtained, compute

$$Z = \Delta \sum_{i} (T_i - U_i).$$

Continue taking observations until either Z ≥ a, in which case accept
the hypothesis H₁: λ₂ − λ₁ = (1/θ₂) − (1/θ₁) ≥ Δ, or until Z ≤ b, in which
case accept the hypothesis H₀: λ₁ − λ₂ = (1/θ₁) − (1/θ₂) ≥ Δ. Here
a = log [(1 − β)/α], b = log [β/(1 − α)], where α is the desired maximum
probability of rejecting H₀ when H₀ is true, and β the desired maximum
probability of rejecting H₁ when H₁ is true. It has been shown in [3] that,
unfortunately, Girshick’s method can not be generalized to deal with
problems like testing whether λ₂ − λ₁ = (1/θ₂) − (1/θ₁) = 0 against the
alternative λ₂ − λ₁ = (1/θ₂) − (1/θ₁) = Δ.

Method 2. Consider the problem of constructing an estimate Δ* of
Δ = λ₂ − λ₁, where λ₁, λ₂ are the unknown parameters of two Poisson
processes, such that with probability at least 1 − β, |Δ* − Δ| ≤ η, where
β and η are given positive constants. One solution is given by setting
Δ* = λ₂* − λ₁*, where λ₁* and λ₂* are obtained as in Method 3 of Section
3 above, taking ε = η/2 and (1 − α)² = 1 − β. In case the two processes
can be observed simultaneously, a more efficient solution is the
following: Let m be a positive integer. Observe T_m, the waiting time required
for the occurrence of a total of m events when the two processes are
observed simultaneously. Let d = βη²/2m. Perform additional
observation of each process for 1/(2dT_m) units of time; let Y₁, Y₂ be the respective
numbers of events observed in the two processes in this period. Let
Δ″ = 2dT_m(Y₂ − Y₁). Then Δ″ is an estimate meeting the stated
requirements. The estimates Δ* and Δ″ provide useful simple solutions
of the common Geiger counter problem of estimating the difference
between “noise” count rate and “source plus noise” count rate.

The stated properties of Δ″ are verified as follows: E(Δ″) = Δ,
E(Δ″ − Δ)² = 2md. Then by Tchebycheff’s inequality

$$\text{Prob}\{|\Delta'' - \Delta| \le \eta\} \ge 1 - \frac{2md}{\eta^{2}} = 1 - \beta.$$

Further investigation should provide rules for optimal choice of n or m,
and for shorter periods of additional observation which will suffice to
meet the stated requirements.


6. COMPARING THREE OR MORE POISSON PROCESSES 


Generalizations of two of the preceding methods for comparing two
Poisson processes are available for comparing any number k of
processes.

The generalization of Method 4 of Section 5A is the following: Let
T_i be the amount of observation required for the observation of a
preassigned number n_i of events in the ith process, for i = 1, ···, k. Then
2λ_iT_i has the chi-square distribution with 2n_i degrees of freedom, where
λ_i is the parameter of the ith process, for each i. Thus this problem is
statistically equivalent to that of comparing variances of k normal
populations on the basis of sample variances based respectively on
2n₁, ···, 2n_k degrees of freedom. For the problem of testing whether k
normal variances are equal, tables of critical values are available in
[16]. Let us denote by S_i the sum of squares for the ith normal
population, based on 2n_i degrees of freedom, used in this test procedure. Then
if we substitute for each S_i the observed value T_i, for i = 1, ···, k,
and then proceed formally with this test procedure, we obtain a test of
the hypothesis that the k Poisson parameters are equal; this test has
the same significance level as the original test, and a corresponding
power function.

Every procedure for comparing normal variances can be similarly
adapted to give a corresponding procedure for comparing Poisson
process parameters.

Method 3 of Section 5A can be generalized as follows: Let k Poisson
processes with respective parameters λ₁, ···, λ_k be observed
simultaneously. Then the probability that the jth event observed will be
observed in the ith process is

$$p_i = \frac{\lambda_i}{\lambda_1 + \cdots + \lambda_k}, \quad \text{for each } j.$$


Thus the observed events can be considered independent observations
from a multinomial population with class proportions p₁, ···, p_k, and
all methods for dealing with multinomial data can be applied. For
example, the hypothesis that the processes have equal parameters is
equivalent to the hypothesis that the corresponding proportions p_i are
equal: H₀: p₁ = ··· = p_k = 1/k. A chi-square test of this hypothesis may
be carried out in a k × 2 table; the chi-square statistic for this problem is
generally termed the “index of dispersion.” The limitations of Method
1 described above apply also to this procedure when a preassigned
amount t of observation is used.
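
For equal amounts of observation on each of k processes, the chi-square statistic for the hypothesis of equal proportions can be sketched as follows. The function name and the example counts are hypothetical, and the form used here, the sum of squared deviations of the counts from their mean divided by that mean, is the usual goodness-of-fit statistic with k − 1 degrees of freedom.

```python
def index_of_dispersion(counts):
    """Chi-square statistic and degrees of freedom for equal class proportions.

    counts: numbers of events observed in each of k processes, all observed
    for the same amount of observation.
    """
    k = len(counts)
    n = sum(counts)
    if k < 2 or n == 0:
        raise ValueError("need at least two processes and one observed event")
    expected = n / k                       # expected count per process under H0
    statistic = sum((x - expected) ** 2 for x in counts) / expected
    return statistic, k - 1

statistic, dof = index_of_dispersion([10, 20, 30])   # hypothetical counts
print(statistic, dof)
```

The statistic would then be referred to a chi-square table with the returned degrees of freedom.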

Further theoretical discussion of Poisson processes is given in [7] 
and [12]. 




REFERENCES

[1] Berkson, Joseph, Magath, Thomas B., and Hurn, Margaret, “Laboratory
standards in relation to chance fluctuations of the erythrocyte count as
estimated with the hemocytometer,” Journal of the American Statistical
Association, 30 (1935), 414-26.

[2] Birnbaum, Allan, “Some procedures for comparing Poisson processes or
populations,” Biometrika, 40 (1953), 447-49.

[3] Birnbaum, Allan, “Some sequential tests for comparing populations,”
submitted for publication.

[4] Clopper, C. J., and Pearson, E. S., “The use of confidence or fiducial belts
illustrated in the case of the binomial,” Biometrika, 26 (1934), 403-14.

[5] Cox, D. R., “Some simple approximate tests for Poisson variates,”
Biometrika, 40 (1953), 354-60.

[6] Davis, D. J., “An analysis of some failure data,” Journal of the American
Statistical Association, 47 (1952), 113-50.

[7] Doob, J. L., Stochastic Processes, New York, John Wiley and Sons (1953),
98, 398-407.

[8] Dvoretzky, A., Kiefer, J., and Wolfowitz, J., “Sequential decision problems
for processes with continuous time parameter,” Annals of Mathematical
Statistics, 24 (1953), 254-64.

[9] Epstein, B., “Statistical problems in life testing,” Proceedings of the Seventh
Annual Convention of the American Society for Quality Control, May 1953,
385-98.

[10] Epstein, Benjamin, and Sobel, Milton, “Life testing,” Journal of the
American Statistical Association, 48 (1953), 486-502.

[11] Epstein, Benjamin, and Tsao, Chia Kuei, “Some tests based on ordered
observations from two exponential populations,” Annals of Mathematical
Statistics, 24 (1953), 458-66.

[12] Feller, William, An Introduction to Probability Theory and Its Applications,
Vol. I, New York, John Wiley and Sons, Inc. (1950).

[13] Garwood, F., “Fiducial limits for the Poisson distribution,” Biometrika,
28 (1936), 437-42.

[14] Girshick, M. A., “Contributions to the theory of sequential analysis I,”
Annals of Mathematical Statistics, 17 (1946), 123-43.

[15] Girshick, M. A., Rubin, H., and Sitgreaves, R., “Estimates of bounded
variation in particle counting,” Report at September 1952 Meetings of the
Institute of Mathematical Statistics.

[16] Hartley, H. O., “Testing the homogeneity of a set of variances,” Biometrika,
31 (1940), 249-55.

[17] Kitagawa, Tosio, Tables of the Poisson Distribution, Tokyo, Japan,
Baifukan (1951).

[18] Maguire, B. A., Pearson, E. S., and Wynn, A. H. A., “The time intervals
between industrial accidents,” Biometrika, 39 (1952), 168-80.

[19] Przyborowski, J., and Wilenski, H., “Homogeneity of results in testing
samples from Poisson series,” Biometrika, 31 (1940), 313-23.

[20] Ricker, W. E., “The concept of confidence or fiducial limits applied to the
Poisson frequency,” Journal of the American Statistical Association, 32
(1937), 349-56.

[21] Sukhatme, P. V., “On the analysis of k samples from exponential
populations with special reference to the problem of random intervals,” Statistical
Research Memoirs, Vol. I (1936), 94-112.

[22] Tocher, K. D., “Extension of the Neyman-Pearson theory of tests to the
discontinuous case,” Biometrika, 37 (1950), 130-44.

[23] Wald, Abraham, Sequential Analysis, New York, John Wiley and Sons.

[24] Wald, Abraham, “Sequential method of sampling for deciding between two
courses of action,” Journal of the American Statistical Association, 40
(1945), 277-306.




APPLICATIONS OF THE CIRCULAR 
NORMAL DISTRIBUTION 


E. J. GUMBEL
Columbia University* 


IN a previous paper [8] the circular normal distribution was
introduced. Now its practical use will be shown. The different
procedures explained in the next paragraph will be applied to typical
observations taken from geophysical, vital and economic statistics. Finally, a
discussion of unsolved problems will be given.


1. PROCEDURE 


For simplicity's sake, we consider in the following only circular
distributions over the year. The observations are grouped or can be
grouped into twelve monthly periods. Then the time series is
concentrated in twelve frequencies or relative frequencies p_ν (ν = 1, 2, ···, 12).
In most cases p_ν stands for the mean frequency of the νth month, the
mean being taken for all observed years. Since the parameters of the
circular normal distribution are invariant under a multiplication of all
frequencies, it is irrelevant whether the absolute or relative frequencies
are used. This important property allows the application of the
circular normal distribution to time series which by themselves do not
constitute distributions in the usual statistical sense. Let p_ν be any
non-negative numbers of arbitrary nature or dimension, corresponding
to the months; then the series p_ν/Σp_ν may be regarded as a
distribution provided that the summation makes sense and may be analyzed
as a circular variate.

However, in our illogical calendar system the lengths of the months
vary from 28 to 31 days. If the mode occurs in July, February shows a
minimum leading to an apparent asymmetry. Conversely, when the
mode occurs in February, its deficiency of two days may create a
hole. Since these artificial influences must be eliminated, let the
observed frequencies for January, March, May, July, August, October
and December be multiplied by 30/31 = 0.96774 and for February by
30/28 = 1.07143; let p_ν′ be the frequencies so adjusted. The year is
thus reduced to 360 days. Therefore, the sum of the frequencies is no
longer n = Σ p_ν.


* Work done in part as Consultant to Stanford University and in part under grant from the
[...]ging Foundation.




To conserve this observed sum the adjusted frequencies are multi- 

plied by the quotient y 

+ tp tp H p t p t pot ро 
Pi + po! + ps’ + ps’ + pr + ps’ + Pro’ + pu 

In this procedure the observed frequencies for April, June, September,
and November are preserved while the frequencies for the other months
are twice adjusted. The adjustments must be used if the differences
between the frequencies are small, say of the order 10%, since they are
then strongly affected by the different lengths of the months. The
adjustments are not relevant if the differences between different
months are large, say if the largest monthly frequency is 10 times the
smallest one. The second adjustment is unnecessary if the quotient
differs from unity by less than 1%.
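The two adjustments can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `adjust_for_month_length` and the dictionary layout are hypothetical, and the data are the observed Esopus rainfall figures of Table 2, column 2.

```python
# Sketch of the two month-length adjustments described above.
# Assumption: `obs` maps month abbreviations to observed monthly frequencies.

LONG_MONTHS = ("Jan", "Mar", "May", "Jul", "Aug", "Oct", "Dec")  # 31 days
FIRST_FACTOR = {m: 30 / 31 for m in LONG_MONTHS}
FIRST_FACTOR["Feb"] = 30 / 28


def adjust_for_month_length(obs):
    """Return (first_adjusted, twice_adjusted, quotient)."""
    first = {m: v * FIRST_FACTOR.get(m, 1.0) for m, v in obs.items()}
    # Quotient: observed sum over first-adjusted sum, taken over the eight
    # months whose frequencies were changed by the first adjustment.
    changed = list(FIRST_FACTOR)
    quotient = sum(obs[m] for m in changed) / sum(first[m] for m in changed)
    # 30-day months keep their observed value; the others are rescaled.
    twice = {m: first[m] * quotient if m in FIRST_FACTOR else obs[m]
             for m in obs}
    return first, twice, quotient


# Mean monthly rainfall in inches, Esopus watershed (Table 2, column 2).
esopus = {"Jan": 3.40, "Feb": 3.05, "Mar": 3.85, "Apr": 4.23, "May": 4.31,
          "Jun": 4.54, "Jul": 4.69, "Aug": 4.58, "Sep": 4.50, "Oct": 4.03,
          "Nov": 4.28, "Dec": 3.45}
first, twice, quotient = adjust_for_month_length(esopus)
```

By construction the twice adjusted values add up to the observed total, so the second step never changes the sample size n.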

The adjusted frequencies are attributed to the 15th of each month
and may be traced as a linear histogram or on polar paper consisting of
concentric equidistant circles and radii for each degree. The maximum
frequency is traced at north and the following months are traced
clockwise. After the choice of a unit distance the square roots of the
twice adjusted frequencies, $\sqrt{p_\nu''}$, are plotted instead of the
adjusted frequencies $p_\nu''$ themselves. This procedure equalizes the
areas of the observed and the theoretical distributions. Distributions
differing with respect to the sample size $n$ are thus traced in different
scales. However, a uniform scale is obtained if all adjusted frequencies
$p_\nu''$ are divided by their mean $\bar p$. This method has the advantage
that distributions with different values of $n$ may be traced on the same
scale. It has the disadvantage that the frequencies are no longer visible.

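In the use of the circular normal distribution¹

\[ \psi(\alpha) = \frac{e^{k\cos(\alpha-\alpha_0)}}{2\pi I_0(k)} \tag{1.1} \]

the mode $\alpha_0$ is estimated from

\[ \tan\alpha_0 = \sum_{1}^{12}\sin\alpha_\nu \Big/ \sum_{1}^{12}\cos\alpha_\nu \tag{1.2} \]

where

\[ \sum_{1}^{12}\cos\alpha_\nu = \text{July} - \text{Jan.} + 0.86603(\text{Aug.} - \text{Feb.} + \text{June} - \text{Dec.}) + 0.5(\text{Sept.} - \text{Mar.} + \text{May} - \text{Nov.}), \tag{1.3} \]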


APPLICATIONS OF CIRCULAR NORMAL DISTRIBUTION 269 


\[ \sum_{1}^{12}\sin\alpha_\nu = \text{Oct.} - \text{Apr.} + 0.86603(\text{Sept.} - \text{Mar.} + \text{Nov.} - \text{May}) + 0.5(\text{Aug.} - \text{Feb.} + \text{Dec.} - \text{June}), \tag{1.4} \]

and the months are written instead of their frequencies $p_\nu''$. This
determines $\alpha_0$ only up to 180°. The exact location is found from the
conventional diagram for the signs of the trigonometric functions. In
the estimation of $\alpha_0$ it is sufficient to calculate to whole degrees
of the angle.
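The sums (1.3) and (1.4) and the quadrant rule can be written out as a short Python sketch. The names `trig_sums` and `mode_direction` are hypothetical, and `atan2` is used here in place of the sign diagram mentioned in the text. The data are the twice adjusted Esopus rainfall values of Table 2, column 4, with the observed modal month (July) taken as origin.

```python
import math

# Months in calendar order; each month is placed at a multiple of 30 degrees,
# counted clockwise from the month chosen as origin.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def trig_sums(freq, origin="Jul"):
    """Frequency-weighted sums of cos and sin of the month angles."""
    start = MONTHS.index(origin)
    c = s = 0.0
    for i, month in enumerate(MONTHS):
        angle = math.radians(30 * ((i - start) % 12))
        c += freq[month] * math.cos(angle)
        s += freq[month] * math.sin(angle)
    return c, s


def mode_direction(c, s):
    """Angle alpha_0 in degrees in [0, 360); atan2 resolves the quadrant."""
    return math.degrees(math.atan2(s, c)) % 360


# Twice adjusted Esopus rainfall (Table 2, column 4).
rain = {"Jan": 3.36, "Feb": 3.34, "Mar": 3.81, "Apr": 4.23, "May": 4.27,
        "Jun": 4.54, "Jul": 4.64, "Aug": 4.53, "Sep": 4.50, "Oct": 3.99,
        "Nov": 4.28, "Dec": 3.42}
c, s = trig_sums(rain)
alpha0 = mode_direction(c, s)
```

With the two-decimal table values the sums come out close to, but not exactly at, the figures quoted in the rainfall example, which were presumably computed with more decimals.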
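The parameter $k$ is estimated from

\[ \bar a = \frac{1}{n}\sqrt{\Big(\sum_{1}^{12}\cos\alpha_\nu\Big)^2 + \Big(\sum_{1}^{12}\sin\alpha_\nu\Big)^2} \tag{1.5} \]

with the help of Table II¹ which gives $k$ as a function of $\bar a$. The
parameters for the reduced values $p_\nu''/\bar p$ are obtained in an
analogous manner. The adjustments for months of equal lengths may be
introduced into (1.3) and (1.4). This leads to

\[ \sum_{1}^{12}\cos\alpha_\nu = 0.96774(\text{Jul.} - \text{Jan.}) + 0.86603\,\text{June} + 0.83809(\text{Aug.} - \text{Dec.}) + 0.5(\text{Sept.} - \text{Nov.}) + 0.48387(\text{May} - \text{Mar.}) - 0.92789\,\text{Feb.}, \]

\[ \sum_{1}^{12}\sin\alpha_\nu = 0.96774\,\text{Oct.} + 0.86603(\text{Nov.} + \text{Sept.}) + 0.48387(\text{Aug.} + \text{Dec.}) - \text{Apr.} - 0.83809(\text{Mar.} + \text{May}) - 0.53571\,\text{Feb.} - 0.5\,\text{June}. \]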


Since the parameters $\alpha_0$ and $k$ are invariant under a multiplication
of the frequencies, no adjustment is then necessary to conserve the
sum of the observations. However, for the comparison to the theory the
observations must again be adjusted for months of equal lengths and
for the conservation of the observed sum. Therefore, this system is not
used in the examples that follow.

In another analytic procedure a constant value equal to the minimum
monthly value $p_{\min}$ is assumed to hold throughout the year. This
component is treated as a circular uniform distribution. The other
component, consisting of the observed frequencies $p_\nu - p_{\min}$, is
considered as a circular normal distribution with $n' = n - 12p_{\min}$.
However, up to now no case has occurred in which this procedure gives
a better fit than the previous one.

¹ Tables with Roman numbers refer to the previous publication [8].

The theoretical values corresponding to the observations $\sqrt{p_\nu''}$ are
obtained after multiplying the values given in Table III by
$0.28868\sqrt{n}$. If $k$ is nearly equal to one of the values given in
Table II, it is sufficient to use a slide rule. The theoretical values
corresponding to reduced frequencies $p_\nu/\bar p$ are taken from Table III
without any multiplication.


GRAPH 1
CIRCULAR NORMAL DISTRIBUTION ψ(α)
AND DIFFERENCE OF PROBABILITIES ΔΨ(α)
(vertical axis: percentage)


The theoretical values are plotted on the polar diagram as points.
A continuous symmetrical curve is obtained by joining the points to
the left and right of the mode by the same parts of a French curve. If
the modal direction is practically zero or a multiple of 30 degrees, i.e.
if the observed mode is either conserved or shifted by these amounts,
we may use differences of probabilities $\Delta\Psi(\alpha)$ instead of the
densities $\psi(\alpha)$. This procedure leads to a comparison of the observed
with the theoretical wedge diagram. It has the advantage that the
conventional criteria for the goodness of fit can be used.

TABLE 1
THE CIRCULAR NORMAL PROBABILITY DIFFERENCES ΔΨ(α)
FOR 12 AREAS OF 30°, CENTERED ON:

 k     Mean    ±30°    ±60°    ±90°    ±120°   ±150°   ±180°
0.0   .08333  .08333  .08333  .08333  .08333  .08333  .08333
0.1   .09176  .09056  .08735  .08314  .07912  .07630  .07530
0.2   .10054  .09793  .09111  .08255  .07476  .06953  .06770
0.3   .10962  .10539  .09458  .08158  .07031  .06304  .06058
0.4   .11895  .11286  .09774  .08024  .06581  .05690  .05394
0.5   .12846  .12032  .10054  .07858  .06133  .05110  .04780
0.6   .13810  .12768  .10296  .07662  .05690  .04570  .04217
0.7   .14782  .13491  .10501  .07440  .05256  .04069  .03704
0.8   .15755  .14196  .10666  .07196  .04837  .03608  .03239
0.9   .16726  .14880  .10793  .06933  .04434  .03186  .02822
1.0   .17690  .15539  .10882  .06656  .04049  .02804  .02449
1.1   .18644  .16171  .10934  .06369  .03686  .02459  .02118
1.2   .19584  .16772  .10953  .06076  .03345  .02149  .01826
1.3   .20507  .17345  .10938  .05781  .03026  .01872  .01569
1.4   .21413  .17885  .10895  .05484  .02730  .01627  .01344
1.5   .22298  .18395  .10824  .05190  .02457  .01410  .01149
1.6   .23163  .18874  .10729  .04901  .02206  .01219  .00979
1.7   .24007  .19320  .10613  .04617  .01978  .01052  .00833
1.8   .24829  .19738  .10476  .04344  .01768  .00906  .00707
1.9   .25630  .20127  .10324  .04078  .01578  .00778  .00599
2.0   .26409  .20487  .10158  .03822  .01407  .00668  .00506
2.1   .27167  .20822  .09979  .03578  .01252  .00572  .00427
2.2   .27905  .21130  .09790  .03345  .01113  .00489  .00360
2.3   .28623  .21415  .09592  .03124  .00988  .00418  .00303
2.4   .29322  .21677  .09388  .02914  .00876  .00359  .00255
2.5   .30003  .21917  .09179  .02718  .00776  .00304  .00214
2.6   .30666  .22138  .08966  .02528  .00687  .00258  .00179
2.7   .31312  .22338  .08752  .02352  .00607  .00220  .00150
2.8   .31942  .22522  .08535  .02186  .00536  .00187  .00126
2.9   .32557  .22688  .08318  .02031  .00474  .00158  .00105
3.0   .33157  .22838  .08100  .01886  .00418  .00135  .00088
3.1   .33744  .22974  .07884  .01750  .00369  .00114  .00073
3.2   .34317  .23096  .07670  .01624  .00325  .00096  .00061
3.3   .34878  .23204  .07458  .01505  .00286  .00082  .00051
3.4   .35427  .23301  .07249  .01394  .00252  .00069  .00043
3.5   .35964  .23386  .07042  .01292  .00222  .00058  .00036
3.6   .36490  .23460  .06839  .01196  .00196  .00049  .00030
3.7   .37006  .23524  .06639  .01107  .00172  .00042  .00025
3.8   .37513  .23579  .06442  .01026  .00151  .00035  .00020
3.9   .38009  .23625  .06251  .00949  .00133  .00030  .00017
4.0   .38497  .23662  .06063  .00877  .00117  .00025  .00014
4.1   .38976  .23692  .05879  .00812  .00102  .00021  .00012
4.2   .39446  .23714  .05700  .00750  .00090  .00018  .00010

The differences of probabilities (Table 1) were obtained by taking


half of the successive differences of the probability function calculated
to 9 decimals and attributing them to the midpoints of the intervals.
Graphs 1 and 2 show a wedge diagram obtained from Table 1 and the
continuous distribution traced on linear and on equiareal polar scales.
Since the observations can only be traced in the wedge form, while the
theoretical distribution can also be traced as a continuous curve, it is
desirable to compare the appearance of one and the same distribution in
the two forms. The graphs also show the type of deviations that are
inevitable if an observed wedge diagram is compared to a continuous
distribution. With the exception of the modal and anti-modal months,
the curves representing the theoretical distribution intersect the
observed wedge diagram near the middle of each month and the wedge
diagram falls short of (exceeds) the theoretical curve at the beginning
(end) of each month. The tracing of $\Delta\Psi(\alpha)$ is a graphical
alternative to the tracing of the distribution $\psi(\alpha)$. In general it
has the disadvantage that the "months" considered therein do not coincide
with the months in the calendar.

GRAPH 2
CIRCULAR NORMAL DISTRIBUTION ψ(α)
AND DIFFERENCE OF PROBABILITIES ΔΨ(α)
TRACED ON AN EQUIVALENT POLAR SCALE
K = 0.5
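For readers without the tables, the wedge probabilities can be approximated numerically. The sketch below is not the procedure used for Table 1 (which differenced a tabulated probability function); it integrates the density (1.1) over each 30° sector with Simpson's rule. The function names are hypothetical and the Bessel function is evaluated from its power series.

```python
import math


def bessel_i0(k, terms=40):
    """Modified Bessel function I_0(k) by its power series."""
    return sum((k / 2) ** (2 * m) / math.factorial(m) ** 2
               for m in range(terms))


def density(alpha, k, alpha0=0.0):
    """Circular normal density (1.1); angles in radians."""
    return math.exp(k * math.cos(alpha - alpha0)) / (2 * math.pi * bessel_i0(k))


def sector_probability(center_deg, k, width_deg=30.0, steps=200):
    """Probability of a sector centered `center_deg` degrees from the mode.

    Integrates the density by Simpson's rule; `steps` must be even.
    """
    a = math.radians(center_deg - width_deg / 2)
    b = math.radians(center_deg + width_deg / 2)
    h = (b - a) / steps
    total = density(a, k) + density(b, k)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(a + i * h, k)
    return total * h / 3


# Twelve wedges for k = 0.5, starting at the modal sector.
wedges = [sector_probability(30 * j, 0.5) for j in range(12)]
```

For k = 0.5 the modal and anti-modal wedges agree with the corresponding row of Table 1 to the printed accuracy.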

The different analytical and graphical procedures outlined will now 
be applied to numerical data. 


2. GEOPHYSICAL OBSERVATIONS 


The following examples deal with rainfall, run-off and evaporation,
i.e. the hydrologic cycle, and temperatures. The amounts of rainfall per
month form a genuine distribution. For the Esopus watershed which is
essential to the water supply of New York City, Table 2 shows the
mean monthly rainfall. The modal (anti-modal) month is written in
the first (last) line. The second column contains the observed numbers
in inches, the third introduces the adjustments for length of months.
The sum of the adjusted frequencies is 30.665" while the observed sum
for the same months before adjustment is 31.36". Consequently the
adjusted frequencies are multiplied by 31.36/30.665 = 1.02263, resulting
in the fourth column which conserves the observed total mean annual
rainfall n = 48.91". The square roots of the numbers in the fourth column
starting with July are plotted in the wedge diagram, Graph 3. The
variations indicate a systematic circular behavior starting with a mode
in July and decreasing to the anti-mode in January with the exception
of the rainfalls in November which exceed those in October.


TABLE 2
MEAN MONTHLY RAINFALL IN INCHES:
ESOPUS WATERSHED

  1       2      3       4        1       2      3       4
Month    Obs.   First   Sec.    Month    Obs.   First   Sec.
                Adj.    Adj.                    Adj.    Adj.
July     4.69   4.539   4.64      -       -       -      -
June     4.54     -     4.54    Aug.     4.58   4.432   4.53
May      4.31   4.171   4.27    Sept.    4.50     -     4.50
Apr.     4.23     -     4.23    Oct.     4.03   3.900   3.99
Mar.     3.85   3.726   3.81    Nov.     4.28     -     4.28
Feb.     3.05   3.268   3.34    Dec.     3.45   3.339   3.42
  -       -       -      -      Jan.     3.40   3.290   3.36
GRAPH 3
MONTHLY MEAN RAINFALL, ESOPUS WATERSHED

To obtain the theoretical distribution we first calculate the
parameters. Formulas (1.3) and (1.4) lead, from Table 2, column 4, to the
sums

\[ \sum_{1}^{12}\cos\alpha_\nu = 3.62553; \qquad \sum_{1}^{12}\sin\alpha_\nu = 0.40142. \]

Consequently

\[ \tan\alpha_0 = 0.11067; \qquad \alpha_0 = 6°. \]

This shifts the mode from 15 to 21 July. It is due to the fact that the
frequencies for the latter part of the year show a slight predominance
over the first part (20.72" against 20.19"). Formula (1.5) leads to

\[ \bar a = 3.64766/48.91 = 0.07458. \]

Table II gives the values

\[ \bar a = 0.07,\ k = 0.1403; \qquad \bar a = 0.08,\ k = 0.1605. \]

Linear interpolation leads to k = 0.1495. It is sufficiently accurate to
choose k = 0.15. This small value of the parameter is due to the fact
that the differences between the frequencies are quite small.
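The step from the vector strength to k can be checked without Table II. The sketch below assumes that Table II tabulates the solution of I₁(k)/I₀(k) = ā, which is the usual estimating equation for this distribution and agrees with the pairs quoted above (ā = 0.07, k = 0.1403). That identification is an assumption, since Table II itself is not reproduced here. The function names are hypothetical.

```python
import math


def bessel_i(order, k, terms=60):
    """Modified Bessel function I_order(k), integer order, by power series."""
    return sum((k / 2) ** (2 * m + order)
               / (math.factorial(m) * math.factorial(m + order))
               for m in range(terms))


def vector_strength(cos_sum, sin_sum, n):
    """Formula (1.5): length of the resultant divided by the sample size."""
    return math.hypot(cos_sum, sin_sum) / n


def k_from_strength(a_bar, hi=20.0, tol=1e-10):
    """Solve I_1(k)/I_0(k) = a_bar by bisection (the ratio increases with k)."""
    lo = 0.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if bessel_i(1, mid) / bessel_i(0, mid) < a_bar:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Esopus rainfall: trigonometric sums and n as quoted in the text.
a_bar = vector_strength(3.62553, 0.40142, 48.91)
k = k_from_strength(a_bar)
```

The direct solution differs from the interpolated k = 0.1495 only in the fourth decimal, which supports the remark that k = 0.15 is sufficiently accurate.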

From Table III we obtain the radii vectors $y(\alpha)\sqrt{12/n}$ by
interpolating between k = 0.1 and k = 0.2. These values are given in
Table 3, column 2. Multiplication by $\sqrt{48.91/12} = 2.0189$ leads to
the radii vectors $y(\alpha)$ (column 3), which are plotted on the original
grid of the polar paper to the left and to the right of the modal value
at 21 July. The fit of the theory to the observations leaves nothing to
be desired.


TABLE 3
CALCULATION OF RADII VECTORS

  1        2             3
Angle   y(α)√(12/n)     y(α)
  0°     1.075          2.170
 20°     1.070          2.160
 40°     1.051          2.122
 60°     1.035          2.090
 80°     1.012          2.043
100°     0.9840         1.987
120°     0.9603         1.939
140°     0.9414         1.901
160°     0.9293         1.876
180°     0.9252         1.868
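The radii of Table 3 can be recomputed directly. The sketch assumes that Table III lists the quantity sqrt(e^{k cos α} / I₀(k)), that is, the square root of 2π times the density (1.1). This reading is inferred from the values in Table 3, not taken from Table III itself. Here k = 0.15 is used directly instead of interpolating between k = 0.1 and k = 0.2, and the function names are hypothetical.

```python
import math


def bessel_i0(k, terms=40):
    """Modified Bessel function I_0(k) by its power series."""
    return sum((k / 2) ** (2 * m) / math.factorial(m) ** 2
               for m in range(terms))


def reduced_radius(alpha_deg, k):
    """sqrt(exp(k cos alpha) / I_0(k)): the radius for a sample with n = 12."""
    return math.sqrt(math.exp(k * math.cos(math.radians(alpha_deg)))
                     / bessel_i0(k))


def radius(alpha_deg, k, n):
    """Radius vector scaled to a sample of total size n."""
    return reduced_radius(alpha_deg, k) * math.sqrt(n / 12)


# Columns 2 and 3 of Table 3 for the Esopus rainfall (k = 0.15, n = 48.91).
table = [(a, reduced_radius(a, 0.15), radius(a, 0.15, 48.91))
         for a in range(0, 181, 20)]
```

Squaring a radius gives the theoretical monthly frequency at that angle, which is why the plotted areas of theory and observation are comparable.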


The numbers of occurrences of rainfall of 1" or more per hour also
form a genuine distribution. The observed numbers taken from Dyck
[7] for 156 stations in the United States, 1908-37, are given at the
bottom of the linear Graph 4. The sum of the adjusted frequencies is
4366.5 while the corresponding sum without adjustment is 4502.
Multiplication of the adjusted numbers by 1.03103 conserves the observed
total sum 7235. The twice adjusted numbers are given at the top of
Graph 4. The adjustments are less than 1% except for February where
the observed value is increased by more than 10%. The mode in July is
more than 14 times the minimum in January. Therefore, the parameter k
is relatively high, k = 1.34. The calculation of the parameters in this
and all following examples is given in Table 9. The distribution is
slightly skewed to the right; consequently, the mode is shifted from
15 to 18 July.

The theoretical distribution is obtained by the same procedure as
before. We interpolate $y(\alpha)\sqrt{12/n}$ from Table III and multiply these
values by $\sqrt{n/12} = 24.554$ to obtain $y(\alpha)$. The squares of these
numbers are plotted in Graph 4 and traced in a continuous curve.

The monthly runoff in inches for the watershed of the Derwent River 
at Yorkshire Bridge, Derbyshire (England) taken from [14], p. 397, 
is given in Table 4 for the 43 years June 1905 to May 1948. The monthly 
percentages of the annual runoff form a distribution. Therefore the 
monthly runoff can be analyzed as a cyclical variate. 


GRAPH 4
NUMBER OF OCCURRENCES OF RAINFALL, ONE INCH OR MORE PER HOUR - U.S.A., 1908-37


The second column gives the runoff for the 43 years. The third
column gives the monthly means adjusted for length of months, the
fourth conserves the observed yearly sum n = 36.28". The distribution
is slightly asymmetrical since the mode is in January while the
anti-mode is in June. The observations and the theory are traced in
equivalent polar and in linear scales in Graph 5. The fit is quite
satisfactory, since the deviations between theory and observations do
not show any systematic trend. The authors of the article [14] assumed
that the differences between the months are due to chance. This opinion
implies a uniform distribution of the monthly runoff which seems highly
improbable, to say the least.

The circular normal distribution cannot be applied to rivers with 
two regimes, one derived from rainfalls, say in Autumn, and the 
other from the melting of snow, say in Spring. These conditions lead 
to a bi-modal distribution. 


TABLE 4
MONTHLY RUNOFF IN INCHES: DERWENT RIVER 1905-48

  1       2        3        4        1       2        3        4
Month    Sum     Runoff   Runoff   Month    Sum     Runoff   Runoff
                 First    Sec.                      First    Sec.
                 Adj.     Adj.                      Adj.     Adj.
Jan.    210.29   4.733    4.81     Dec.    191.47   4.309    4.38
Feb.    171.87   4.282    4.35     Nov.    185.17   4.306    4.31
Mar.    147.85   3.327    3.38     Oct.    142.28   3.202    3.26
Apr.    106.84   2.485    2.49     Sept.    91.57   2.130    2.13
May      78.85   1.775    1.80     Aug.     86.54   1.948    1.98
June     67.27   1.564    1.56     July     80.09   1.802    1.83


The monthly evaporations from a reservoir do not form a distribution
but they may be analyzed as the preceding examples for the reasons
given above. Since the evaporations depend upon temperature, the
maximum occurs in summer and the minimum in winter. The observations
in inches for Yuma, Arizona, taken from the Hydrology Handbook [9],
p. 127, and the twice adjusted data are given in the linear Graph 6.
The mode occurs in July and the anti-mode in January. However, the
evaporations in the second half of the year are slightly stronger than
in the first one. Consequently there is a slight shift in the mode to
21 July. The theory leads to a very good fit to the observations.

The mean monthly temperatures at a given place do not constitute a
distribution, since temperatures cannot be added. However, it is
certainly legitimate to trace them as a function of time. This is done
in Graph 7 for the mean monthly temperatures in Boston (Mass.) taken
from Conrad [2], p. 110. A reduction for months of equal lengths does
not seem to be necessary. The circular distribution, used to reproduce
the data, gives an excellent picture.


GRAPH 5
MONTHLY RUNOFF - DERWENT RIVER (ENGLAND), 1905-48
(vertical axis: runoff in inches)

GRAPH 6
(horizontal axis: months, Jan. to Jan.; vertical axis: evaporation in inches)

We conclude from the previous examples, which are illustrative of a
wide range of problems, that the circular normal distribution is an
efficient tool for the analysis of meteorological phenomena, even in
cases which are not actually distributions.


3. APPLICATIONS TO VITAL STATISTICS 


The percentage of persons who die each month forms a distribution.
Since the population is growing in most countries it is customary in
vital statistics to present instead the death rates, i.e. the number of
deaths divided by the time lived through by the respective population.

GRAPH 7
MONTHLY TEMPERATURES - BOSTON (Conrad, p. 110)

The monthly means of the observed and adjusted death rates in the
United States, September 1946 to August 1951, are given in Table 5,
where the mode and anti-mode are given in the first and last lines
respectively.

TABLE 5
DEATH RATES U. S. A. SEPT. 1946-AUG. 1951

  1       2       3       4        1       2       3       4        5
Month    Obs.   First    Sec.    Month    Obs.   First    Sec.    Theory
                Adj.     Adj.                    Adj.     Adj.
Feb.    10.66   11.42   11.64      -       -       -       -      10.82
Jan.    10.50   10.16   10.35    Mar.    10.80   10.45   10.65    10.68
Dec.    10.42   10.08   10.27    Apr.    10.24     -     10.24    10.31
Nov.     9.70     -      9.70    May      9.60    9.29    9.46     9.81
Oct.     9.42    9.12    9.29    June     9.42     -      9.42     9.34
Sept.    9.08     -      9.08    July     9.18    8.88    9.05     9.00
  -       -       -       -      Aug.     8.96    8.67    8.83     8.88


The two adjustments leave the sum at n = 117.98, shift the mode
from March to February and leave the anti-mode in August. Thereby
the distribution is made more symmetrical. The square roots of the
corrected rates are traced on the polar Graph 8.

It is sufficiently accurate to use k = 0.1. This value is rather small
because the monthly variations are also small. Since both trigonometric
sums are negative, the mode obtained from $\tan\alpha_0 = 0.58487$ is
$\alpha_0 = 210°$. The mode on 15 February is identical with the observed
mode. Therefore, we may use the differences of probabilities
$\Delta\Psi(\alpha)$ in Table 1 to obtain the theoretical rates. The values
$117.98\,\Delta\Psi(\alpha)$ are given in the last column of Table 5. The
square roots of these values are plotted on the Graph 8. The agreement
between theory and observation is quite close. No systematic deviations
are visible except that the theoretical mode is slightly smaller than
the observed one.
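The last column of Table 5 follows from one row of Table 1 by a single multiplication. A minimal sketch, with a hypothetical function name and the k = 0.1 row copied from Table 1:

```python
# Row k = 0.1 of Table 1: sector probabilities for the modal month and for
# the months 1 to 6 steps (30 degrees each) away from it on either side.
ROW_K_01 = [0.09176, 0.09056, 0.08735, 0.08314, 0.07912, 0.07630, 0.07530]


def theoretical_frequencies(n, row):
    """Expected value for each of the 12 months, indexed by steps after the mode."""
    # Month j and month 12 - j lie symmetrically about the mode.
    probs = [row[min(j, 12 - j)] for j in range(12)]
    return [n * p for p in probs]


# Death rates, U.S.A.: n = 117.98, mode on 15 February.
rates = theoretical_frequencies(117.98, ROW_K_01)
```

Because the twelve probabilities add up to one, the theoretical rates add up to the observed sum n.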

Infant death rates are a very important part of vital statistics since
they reflect Social Hygiene. These death rates are proportional to the
number of infant deaths for a stationary population and can therefore
be analyzed by the circular theory. The mean rates for the different
months twice adjusted are shown at the bottom of the linear Graph 9.
The mode and the decrease of the frequencies toward the anti-mode are
quite systematic: the monthly variations are very small. Consequently,
the value of the parameter k also turns out to be very small and the
theoretical distributions traced in Graph 9 resemble a uniform
distribution.


GRAPH 8
DEATH RATES U.S.A., SEPT. 1946 TO AUGUST 1951
(legend: corrected observation; theory)

The cyclical variation of the number of deaths over the years is a
very complex result of different causes of deaths having maxima at
different times. For example, influenza has a mode in January,
pneumonia in February, scarlet fever and meningitis in March, measles in
April, dysentery in July, typhoid fever in August, poliomyelitis in
September, diphtheria in October, and tularemia in December. The
day with maximum probability of deaths should be considered as
critical. The popular idea of the existence of critical days for certain
illnesses is completely legitimate.


GRAPH 9
INFANT DEATH RATES U.S.A., SEPT. 1945 TO AUG. 1951
α₀ = 2 FEB., K = 0.084
(legend: observations; theory; the twice adjusted monthly rates, Sept. to Aug., are printed at the bottom of the graph)

The mean percentages of deaths of children under 1 year of age in the
United States, August 1947 to July 1951, due to pneumonia and
influenza are given in Table 6. These percentages are proportional to
the number of deaths from these causes for a constant number of
children who died. Therefore they may be represented by the circular
distribution. The sum of the observations is n = 114.27; the sum of the
corrected data is n' = 113.32. Since the second adjustment would have
amounted to less than 1%, it was not applied.


TABLE 6
PERCENTAGE OF DEATH OF CHILDREN DUE TO PNEUMONIA
AND INFLUENZA, U. S. A. 1947-1951

  1        2          3          1        2          3         4
Month   Frequency  Adjusted    Month   Frequency  Adjusted   Theor.
        Observed                       Observed              Rate
Feb.     16.07      17.22        -        -          -       15.38
Jan.     14.49      14.02      Mar.     13.97      13.52     14.27
Dec.     10.31       9.98      Apr.     11.46      11.46     11.60
Nov.      9.40       9.40      May       8.35       8.08      8.74
Oct.      7.58       7.34      June      6.05       6.05      6.57
Sept.     6.02       6.02      July      5.05       4.89      5.33
  -        -          -        Aug.      5.52       5.34      4.98

Sum                                     114.27     113.32    113.33

The distribution is not quite symmetrical. The mode is located in
February but the anti-mode in July (instead of August). Since the
mode falls on 15 February we use again the probability Table 1. The
observed and theoretical wedge diagrams are compared in Graph 10.
The fit may be accepted although the theoretical mode falls short of
the observed one.

Pearl [12], p. 153, gives the percentage of total infant mortality due to
diarrhea and enteritis in the United States for each month, 1933-1935.
Since these percentages are proportional to the monthly number of
deaths from these causes for a stationary population with constant
infant mortality, they may be treated as a circular distribution. The
mean monthly percentages for the three years, adjusted first for months
of equal lengths and then for the conservation of the sum, are given in
the linear Graph 11. The distribution is slightly skew, since the mode
is in July, whereas the anti-mode is in February. This produces a
shift of the theoretical mode to the beginning of August. The fit of the
theory to the observation shown in the circular Graph 11 is again very
satisfactory.

The observed and twice adjusted numbers of persons drowned in each 
month in the United States in 1946 are given respectively in the bottom 


GRAPH 10
PERCENT OF PNEUMONIA AND INFLUENZA
IN INFANT DEATHS, U.S.A., AUG. 1947 TO JULY 1951

The circular character of the distribution is thus clear. However, the
distribution is not quite symmetrical. The mode is in July, but the
anti-mode is in December. The frequencies for the beginning of the
year are stronger than for the end. Therefore, the theoretical mode is
at 26 June instead of 4 July where it should be according to popular
customs. The mode of the theoretical curve is smaller than the observed
one. Consequently the theory gives too large values for August to
October. These deviations are due to the skewness of the observed
distribution caused by the heterogeneity in the data: the activities in
which drowning occurred in July differ from those in December. The
persons drowned in summer come from another population than those
drowned in winter.

GRAPH 11
PERCENT OF TOTAL INFANT MORTALITY
DUE TO DIARRHEA AND ENTERITIS
U.S.A. 1933-35, K = 0.625
(legend: observation; theory; vertical axis: percentage)

GRAPH 12
DEATHS FROM DROWNING, U.S.A. 1946
(legend: observation; theory)
4. APPLICATIONS TO ECONOMIC TIME SERIES 


Other possible applications may be found in economic time series.
But they entail certain difficulties which must be pointed out. First,
most series of prices, production and trade do not constitute
distributions. Second, economic time series usually contain secular or
cyclical movements, which the economist would want to eliminate before
investigating the seasonal movements; but the process of eliminating
these undesirable influences and expressing the series as a set of
seasonal relatives, as is often done, leaves the series with less
resemblance to a distribution than it had originally. Finally, the
seasonals of some time series have configurations that cannot be
adequately described by the assumption of a single circular distribution.
These three difficulties are amplified in connection with the
illustrative examples presented below.

The monthly egg production per laying hen, Table 7, given by Ken...
[10], p. 368, is not, strictly speaking, a frequency distribution.
Although the eggs produced in a period of years by a single hen, a flock,
or even by all hens consist of discrete countable units, the process
of dividing the number of eggs produced by the number of hens producing
them conceals the actual egg frequencies, which are not available.
However, the eggs-to-hens ratio may be regarded as a sort of
distribution for an average hen or an average flock. The observed and
adjusted data are given in Table 7 and traced in Graph 13.

GRAPH 13
MONTHLY EGG PRODUCTION PER LAYING HEN, U.S.A. 1938-40
K = 0.477
(legend, upper part: observations; theory. Legend, lower part: observations; first theory; composition)


TABLE 7
MONTHLY EGG PRODUCTION PER LAYING HEN, U. S. A.
1938-1940

  1       2       3        1       2       3        4        5
Month    Obs.    Adj.    Month    Obs.    Adj.    First    Second
                                                  Theory   Theory
May     17.10   16.91      -       -       -      17.20    18.13
Apr.    17.00   17.00    June    14.77   14.77    16.14    16.47
Mar.    14.90   14.73    July    13.40   13.25    13.59    12.99
Feb.     9.53   10.43    Aug.    11.77   11.64    10.74    10.01
Jan.     7.70    7.61    Sept.    9.47    9.47     8.48     8.29
Dec.     6.67    6.60    Oct.     7.60    7.51     7.18     7.52


The calculations lead to a mode on 16 May. We choose instead 15
May, a difference which is not visible in the graphs. Interpolation for
k = 0.477 in Table 1 leads to the theoretical number of eggs given in
Table 7, column 4. The fit between theory and observations is good and
no systematic deviations exist.

If we consider only the differences of the numbers given in Table 7,
column 3, from their minimum 6.03, the sum of these numbers is
n' = 63.64. The resulting vector strength obtained from (1.5) is
$\bar a = 0.4961$. The direction of the mode remains unchanged. Table II
gives the estimate k = 1.14. The theoretical results
$6.03 + 63.64\,\Delta\Psi(\alpha)$ obtained from Table 1 are given in the last
column of Table 7 and are traced in the lower part of Graph 13. The
differences in the two representations are not visible in the polar
Graph 13. The distribution obtained by the composition of a uniform and
a circular distribution does not show a better fit to the observations
than the simple procedure used before.
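The composition rests on one small fact: removing a constant from all twelve months leaves the trigonometric sums untouched, because the cosines and sines of twelve equally spaced angles add up to zero. Only the sample size shrinks, so the vector strength rises. The sketch below illustrates this; the function names and the `demo` values are hypothetical, while the egg figures (n = 136, resultant 31.572, minimum 6.03) are taken from Table 9 and the text.

```python
import math

ANGLES = [math.radians(30 * j) for j in range(12)]


def resultant_length(freq):
    """Length of the frequency-weighted resultant over the 12 month angles."""
    c = sum(f * math.cos(a) for f, a in zip(freq, ANGLES))
    s = sum(f * math.sin(a) for f, a in zip(freq, ANGLES))
    return math.hypot(c, s)


def remove_uniform_floor(freq):
    """Subtract the minimum from every month; return (floor, remainder)."""
    floor = min(freq)
    return floor, [f - floor for f in freq]


# Hypothetical monthly values, for illustration only (not data from the paper).
demo = [9.0, 8.0, 6.5, 5.0, 4.0, 3.5, 3.0, 3.6, 4.2, 5.5, 7.0, 8.4]
floor, rest = remove_uniform_floor(demo)

# Egg example: n = 136 and resultant 31.572 (Table 9), minimum 6.03.
n_prime = 136 - 12 * 6.03
a_bar_prime = 31.572 / n_prime
```

The larger vector strength is what raises the estimate from k = 0.477 for the simple procedure to k = 1.14 for the composite one.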

The observations in the following examples are treated according to 
the uniform procedure outlined in paragraph 1. 

The series on automobile production [1] is a genuine frequency dis- 
tribution consisting of discrete, countable units—i.e., the number of 
automobiles produced in three years allocated by months. Since these 
are original data, no adjustments have been made for cyclical, secular 
or irregular influences, except those for length of months (column 3). 


TABLE 8
PASSENGER CARS PRODUCTION

  1       2       3        4         1       2       3        4
Month    Obs.    Adj.    Reduced   Month    Obs.    Adj.    Reduced
Aug.     1589    1538    1.193       -       -       -        -
Sept.    1453    1453    1.127     July     1435    1389    1.078
Oct.     1523    1474    1.144     June     1527    1527    1.185
Nov.     1251    1251    0.971     May      1197    1158    0.898
Dec.     1192    1154    0.895     Apr.     1200    1200    0.931
Jan.     1120    1084    0.841     Mar.     1223    1184    0.919
  -       -       -        -       Feb.      985    1055    0.818

Sum                                        15695   15467   12.000

The mean monthly number of cars produced is 1308. Division of
column 3 by this number leads to column 4, the sum of which is of
course 12.
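The reduction to a uniform scale is a division by the mean, as the following sketch shows. The function name is hypothetical. The October entry of column 3 is read as 1474, which is an assumption: it is the value consistent with the reduced figure 1.144 in column 4. With these entries the reduced values of Table 8 are reproduced when the divisor is the mean of column 3 itself (about 1289), which is also the divisor that makes the reduced values add up to 12.

```python
def reduce_to_uniform_scale(freq):
    """Divide each monthly value by the mean so the reduced values sum to 12."""
    mean = sum(freq) / len(freq)
    return [f / mean for f in freq]


# Adjusted passenger car production, Table 8 column 3: Aug. through Jan.,
# then July back through Feb. (October read as 1474, see the note above).
adjusted = [1538, 1453, 1474, 1251, 1154, 1084,
            1389, 1527, 1158, 1200, 1184, 1055]
reduced = reduce_to_uniform_scale(adjusted)
```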

The irregular movements are rather pronounced. August is the
modal month, but October and June exceed September and July, hence
these months appear as conspicuous valleys. Since n = 12 the values
$y(\alpha)$ in Table III constitute the theoretical distribution. It is
sufficiently accurate to use k = 0.2. The smooth curve in Graph 14 does
not give a bad fit and especially the mode and anti-mode are conserved.

The series on pig-iron production, taken from Croxton and Cowden
[5], p. 374, consists of seasonal relatives corrected for cyclical,
secular and irregular influences. The resulting wedge diagram, Graph 15,
is therefore fairly regular. The amplitude is not very large. The
anti-mode precedes the mode by 5 instead of by 6 months, implying a
somewhat skewed distribution. The mode is well reproduced and on the
whole the fit is good, since the deviations between theory and
observations do not show any systematic pattern.


GRAPH 14
PASSENGER CAR PRODUCTION U.S.A. 1948-50 IN UNIFORM SCALE
(legend: observations; theory)


In the sales of Misses' coats, 1926-30 [11], p. 393, two modes and two
antimodes exist: April and October, and July and February, respectively.
Clearly such observations cannot be fitted into the scheme of a single
circular distribution.


5. OUTLOOK

The introduction of a new distribution solves certain statistical
problems, but at the same time raises a considerable number of new
problems, some of which are enumerated below:
TABLE 9 $ 
ESTIMATION OF PARAMETERS 


1 Material Symbol From Rainfall Runoff Evaporation Temperature 
2 Place: USA 156 _ Derwent Yuma (Аг) Boston 
Stations (Мавв,) 
3 Date 1908-37 1905-48 
4 Sample Bize n Obs. 7235 36.28 99.9 595.2 
5 Cosine Zit сова», ед. (1.3) 3997.23 — 9.355 26.081 180.19 
6 Sine Z sinay eq. (1.4) 237.31 2.086 2.856 21.58 
7 Tangent tan ae lines 5, 6 0.05904 — 0.223 0.1095 0.1657 
8 Mode a line 7 187 2 Jan. 21 July 24 July 
9 Sum na eq. (1.5) 4004.27 „9.548 26.237 181.96 
10 Vector 
Strength а lines 4, 9 0.553 0.264 0.263 0,222 
11 Parameter k Table II 1.338 0.548 0.545 0.455 
12 Factor vaia line 4 24.554 1.739 2.885 7.043 
13 Table in graph 4 in graph “= 
14 Graph 4 5 6 7 
. Infant 
Death % % 
кошы Praba ee Rate Death | “Paeumonia i Diarrhea 
Rate 
2 Place USA USA USA USA 
8 Date Se 46- Se 45- Au. 47- 1933-1936 
Au 51 Au. 51 Ju. 51 
4 Bample Size n Obs. 117.98 383.97 113.32 101.00 
5 Cosine Zit cosa, Ед. (1.8) = 5.385 15.041 27.232 20.782 
6 Sine Hainas eq. (1.4) — 3.08 — 4.645 —15.507 12.934 
7 Tangent tan as lines 5, 6 0.572 0.298 0.569 0.484 
8 Mode а line 7 15 Feb, 2 Feb. 15 Feb. 11 Aug. 
9 Sum nā eq. (1.5) 6.204 15.746 31.338 29.696 
10 Vector ? 
Strength а lines 4, 9 0.053 0.041 0.276 0.294 
11 Parameter k "Table II 0.105 0.082 0.576 0.625 
12 Factor MV n/a line 4 — 5.657  . 8.071 2.901 
13 Table 5 in graph 6 in graph 
14-&raph 8 9 10 п 

Line  Material          Symbol      From         Drowning       Eggs per Hen   Car De[illegible]   Pig Iron
 2    Place                                      USA            USA            USA                 USA
 3    Date                                       1946           1938-40        1948-50             1936
 4    Sample size       n           Obs.         6632           136            12.00               12.00
 5    Cosine            Σ cos α_i   eq. (1.3)    2398           16.502         1.045               0.3635
 6    Sine              Σ sin α_i   eq. (1.4)    -850           -26.915        0.494               -0.4756
 7    Tangent           tan α       lines 5, 6   -0.884         -1.631         0.473               -1.308
 8    Mode              α           line 7       26 June        16 May         10 Aug.             22 May
 9    Sum               nā          eq. (1.5)    2450           31.572         1.156               0.599
10    Vector strength   ā           lines 4, 9   0.384          0.232          0.096               0.050
11    Parameter         k           Table II     0.832          [illegible]    0.194               0.100
12    Factor            √(n/12)     line 4       23.509         3.367          1                   1
13    Table                                      in graph       7              8                   [illegible]
14    Graph                                      12             13             14                  15
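
The arithmetic behind lines 5, 6, 9 and 10 of Table 9 can be retraced from the two trigonometric sums. The following minimal sketch assumes that eq. (1.5), which is not reproduced in this section, is the usual resultant length nā = sqrt((Σ cos α)² + (Σ sin α)²); the function name is illustrative only. Applied to the rainfall column it returns values that agree with the printed nā = 4004.27 and ā = 0.553.

```python
import math

def vector_strength(sum_cos, sum_sin, n):
    """Return (n * a_bar, a_bar) from the sums in lines 5 and 6 of Table 9.

    Assumption: eq. (1.5) is the resultant length of the summed unit vectors.
    """
    resultant = math.hypot(sum_cos, sum_sin)   # line 9: n * a_bar
    return resultant, resultant / n            # line 10: a_bar

# Rainfall column: sum of cosines, sum of sines, sample size (line 4).
resultant, a_bar = vector_strength(3997.23, 237.31, 7235)
print(round(resultant, 2), round(a_bar, 3))
```

The same two lines of arithmetic apply to every column, so an entry that fails this check points to a misprint in one of the sums rather than to a different method.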


APPLICATIONS OF CIRCULAR NORMAL DISTRIBUTION 293 


If the deviations between the observations and a circular uniform
distribution are small, a criterion for the reality of a cycle is needed,
see [6]. We want to know whether a mode and antimode exist or are
spurious and due to chance. In statistical terms this means: let the
population value of the parameter be κ (e.g. zero); what is the probability
of obtaining a certain value of k or a larger one from a sample of
size n? A similar, but more complicated, problem is to test the differences
between the modes and values of the parameter k obtained from
two samples. Finally a test for the goodness of fit between theory and
observations is needed.

GRAPH 15
PIG IRON PRODUCTION, 1936. UNIFORM SCALE
(Curves: observations; theory, broken line.)

The following systematic deviations between theory and observations
may exist: the observed values about the mode exceed (fall short
of) the theoretical values and show the opposite behavior at the antimode;
the observed mode differs sensibly from the theoretical one; the
observed antimode is not distant 180° from the observed mode.
Finally there may be two modes and two antimodes. One of the latter
may not be visible. In all these cases the chance distribution about one
fixed date does not explain the observations. Then there must exist
systematic reasons which have to be investigated from the intrinsic
nature of the observations.

The simplest statistical explanation for such divergencies is the
assumption that two cycles are involved instead of one. Two uniform
distributions give again a uniform distribution. The composition
of a uniform and a circular normal distribution leads, as shown before,
to a circular normal distribution. The composition of two circular
normal distributions with the same values α and k leads again to a
circular distribution with α and k. Finally let α₁, α₂ be the two modes,
let k₁, k₂ be the two measures of concentration, let A₁ and A₂ be the
relative weights of the two constituents, then the composite distribution
is

$$\varphi(\alpha) = \frac{A_1\, e^{k_1 \cos(\alpha-\alpha_1)}}{2\pi I_0(k_1)} + \frac{A_2\, e^{k_2 \cos(\alpha-\alpha_2)}}{2\pi I_0(k_2)}$$


If the two modes of the component distributions are sufficiently
near to each other, the combination may lead to an apparent single
mode situated somewhere between them. If the two modes are sufficiently
apart, the composition may lead to one mode and one hump,
corresponding to an inflection point in linear scale, or to two modes,
shifted of course from the original modes. The antimode or antimodes
are no longer at a distance of 180 degrees from the mode or
modes. The locations of the modes or antimodes formally obtained by
differentiation of φ(α) are complicated functions of α₁, α₂, k₁, k₂ and
A₁, A₂.

Graph 16, obtained from Table 1 and traced to linear scale, shows two
component distributions with modes α₁ and α₂ at 15 July and 15 October,
with parameters k₁=0.5 and k₂=1 and weights A₁=3, A₂=2.
They combine into an asymmetrical distribution with a mode in
September and a minimum in February.
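
The composition described for Graph 16 can be checked numerically. The sketch below is only an illustration under stated assumptions: the weights are read as 3 and 2 and normalized to sum to one, dates are mapped to angles at 30 degrees per month with 0 degrees at 1 January (a simplification, not necessarily the author's calendar convention), and I₀ is summed from its power series. All function names are hypothetical.

```python
import math

def bessel_i0(k, terms=40):
    """Modified Bessel function I0(k), summed from its power series."""
    return sum((k / 2) ** (2 * j) / math.factorial(j) ** 2 for j in range(terms))

def circular_normal(alpha, mode, k):
    """Circular normal density at angle alpha (radians)."""
    return math.exp(k * math.cos(alpha - mode)) / (2 * math.pi * bessel_i0(k))

def mixture(alpha, modes, ks, weights):
    """Weighted sum of circular normal components; weights are normalized."""
    total = sum(weights)
    return sum(w / total * circular_normal(alpha, m, k)
               for m, k, w in zip(modes, ks, weights))

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

def mid_month_angle(month_index):
    """Assumed mapping: 30 degrees per month, the 15th taken as mid-month."""
    return math.radians(30.0 * (month_index + 0.5))

modes = [mid_month_angle(6), mid_month_angle(9)]   # 15 July, 15 October
ks = [0.5, 1.0]
weights = [3, 2]

# Evaluate the composite density on a 0.1 degree grid and locate its extremes.
grid_deg = [0.1 * i for i in range(3600)]
density = [mixture(math.radians(d), modes, ks, weights) for d in grid_deg]
mode_deg = grid_deg[density.index(max(density))]
antimode_deg = grid_deg[density.index(min(density))]
mode_month = MONTHS[int(mode_deg // 30)]
antimode_month = MONTHS[int(antimode_deg // 30)]
print(mode_month, antimode_month)
```

Under these assumptions the maximum and minimum fall in the months named in the text, and they are visibly less than 180 degrees apart on one side of the circle, which is the asymmetry the paragraph describes.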

Graph 17, traced in aequiareal scale, shows the composition of two
circular distributions of equal weight and with values k=2, centered
on α₁=0 and α₂=π, together with the two components.

The converse problem of separating an observed asymmetrical or
symmetrical circular distribution into two symmetrical parts presents
no logical difficulties. However, great analytic difficulties arise from
the estimation of the five parameters α₁, α₂, k₁, k₂, A, where A₁=A,
A₂=1−A. Karl Pearson [13] has solved the corresponding problem
within the linear normal distribution. In the general case it leads to
an equation of the ninth degree. The difficulties are reduced, see [3],
if certain assumptions can be made about the modes or the values of
the parameters k or the weights. By this procedure asymmetrical
periodic phenomena can be analyzed by the addition of circular
normal distributions.

GRAPH 16
COMPOSITION OF TWO CIRCULAR DISTRIBUTIONS
α₁ = 15 JULY, α₂ = 15 OCT.; k₁ = 0.5, k₂ = 1; A₁ = 3, A₂ = 2

Another method consists of wrapping an asymmetrical unlimited
distribution around the circle. Even after the above problems are solved
there will remain many cycles that cannot be analyzed by circular
distributions. Statistics were not statistics if everything could be
explained by this method.

Acknowledgment: The author is indebted to Arnold Court (University
of California, Berkeley), David Durand (National Bureau of Economic
Research), and J. A. Greenwood (Manhattan Life Insurance
Co.) for valuable suggestions.

GRAPH 17
COMPOSITION OF TWO CIRCULAR DISTRIBUTIONS


BIBLIOGRAPHY 
[1] Automobile Facts and Figures, Automobile Manufacturers Association, 31st
edition, New York (1951), 5.

[2] Conrad, V. A., Methods in Climatology, Cambridge, Mass. (1946), chap.
VIII.

[3] Court, A., “Separating frequency distributions into two normal components,”
Science, 110 (1949), 24.

[4] Court, A., “Some new statistical techniques in geophysics,” Advances in
Geophysics, edited by H. L. Landsberg, vol. 1, New York (1952), 75-83.

[5] Croxton, F. F., and Cowden, D. J., Applied General Statistics, New York 
(1940), 374. 

[6] Durand, David, and Greenwood, J. A., “The integral solution of Pearson’s 
random walk problem and related matters,” Annals of Mathematical Sta- 
tistics, 24 (1953), 686-87. 

[7] Dyck, H. D., and Mattice, W. A., “A study of excessive rainfalls,” Monthly 
Weather Review, 69 (1941), 293-302. 

[8] Gumbel, E. J., Greenwood, J. A., and Durand, David A., “The circular 
normal distribution: theory and tables,” Journal of the American Statistical 
Association, 48 (1953), 131-52. 

[9] Hydrology Handbook, American Society of Civil Engineers, New York, 
(1949), 127. 

[10] Kendall, M. G., The Advanced Theory of Statistics, London (1946), 127.

[11] Kuznets, Simon, Seasonal Variations in Industry and Trade, National
Bureau of Economic Research, New York (1933), 368.

[12] Pearl, R., Introduction to Medical Biometry and Statistics, Philadelphia
(1940), 153.

[13] Pearson, K., “On the dissection of frequency curves,” Philosophical
Transactions, Royal Society, 185 A (1894), 71-110.

[14] Thompson, R. W. S., “The application of statistical methods in the deter- 
mination of the yield of a catchment from run-off data,” with a statistical 
note by D. H. Thomson, Journal of the Institution of Water Engineers, 4 
(1950), 397. 


A NEW TYPE OF CONTROL CHART LIMITS FOR
MEANS, RANGES, AND SEQUENTIAL RUNS


H. WEILER
New South Wales University of Technology, Sydney, Australia 


Since most conventional control chart limits are designed 
so that the ratio of the expected number of false alarms to the 
number of samples tested is fixed in advance irrespective of 
the sample size, the ratio of false alarms to the number of 
articles tested varies with the sample size. In this paper, con- 
trol charts are designed for which the expected ratio of false 
alarms to the number of articles tested is independent of the 
sample size. The power of the control charts in relation to 
sample size is investigated. 


1. INTRODUCTION AND SUMMARY 


THE usual control chart controlling the mean of a normal population
is constructed in the following way: After the mean and standard
deviation of the population have been reliably estimated, samples of
fixed size n are selected and their arithmetic means x̄ = Σx/n are
calculated. A chart is then constructed with control limits m ± Bσ/√n,
where m and σ are estimates of the population mean and standard
deviation, and B a constant. The various values of x̄ are entered in the
chart in chronological order, and as soon as one such value falls outside
the control limits, production is stopped to allow investigation.

It is customary to set B equal to 3 or 3.09, irrespective of the sample
size n. If B=3.09 and if the population mean and standard deviation
remain unchanged, an average of 500 samples will be required to produce
one x̄ value above the upper or below the lower limit. This means
that an average of 500n articles will be tested for every one false alarm
raised. The usual practice of setting control limits entails therefore
that the average number of articles tested between two false alarms
depends on the sample size n used for the control chart. This is likely
to be an undesirable feature in quality control, where the cost of
inspection is usually proportional to the number of articles inspected.
The production engineer will therefore be interested in the number of
articles rather than the number of samples tested.
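
The figure of 500 samples follows directly from the normal tail area beyond B=3.09. As a rough check, assuming an exactly normal x̄ and using Python's standard `statistics.NormalDist` (the function name below is illustrative), the spacing between false alarms can be computed for several sample sizes:

```python
from statistics import NormalDist

def conventional_false_alarm_spacing(B, n):
    """Average samples and articles between false alarms for limits m ± B*sigma/sqrt(n)."""
    p_one_side = 1.0 - NormalDist().cdf(B)   # tail area beyond one limit
    samples = 1.0 / (2.0 * p_one_side)       # an alarm at either limit counts
    return samples, samples * n

for n in (5, 10, 20):
    samples, articles = conventional_false_alarm_spacing(3.09, n)
    print(n, round(samples), round(articles))
```

The number of samples between false alarms is the same for every n, while the number of articles grows in proportion to n, which is the dependence on sample size that the paragraph objects to.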

In this paper, tables are provided for the determination of control
limits for which the average number of false alarms (or Type I errors)
is a fixed percentage of the number of articles tested, independent of
the sample size. Curves similar to power curves are drawn for various
sample sizes, giving the average amount of inspection in terms of the
amount by which the population mean has changed. A similar procedure
is adopted for range charts.

One important result of the investigations is that the power of a
mean chart increases rapidly with the sample size. For instance, the
average amount of inspection required to detect a given change of the
population mean is in most practical cases about twice as large for
sample size n=5 as it would be if a chart for sample size n=10 were
used (unless the change of the mean is very large). For samples of 20,
the average amount of inspection can, under certain favorable circumstances,
be as little as one quarter of the amount required with samples
of 5. This has already been pointed out in two previous papers [2, 3];
the results obtained here are even more striking.

However, since it is often necessary to maintain small samples in
spite of the loss of power, it was shown in [3] that the power of small
samples can be improved by charts using sequential runs. In the
present paper, tables similar to those mentioned above are provided
for the determination of suitable control limits for run charts. The
corresponding curves show that the amount of inspection is greatly
reduced, although it is frequently still high compared with the amount
required for simple large sample charts.

Finally, run charts for ranges are briefly investigated. It turns out
that the power of range charts is practically independent of the sample
size and that the use of runs does not represent an improvement.


2. CONTROL LIMITS FOR MEAN CHARTS


If P is the probability that a random sample mean falls above the
upper control limit, then about 100P samples in every 100, or one in
every 1/P samples, will fall above the upper control limit. It follows
that on the average n/P articles will be tested before an alarm is raised
at the upper limit.

Now, let p be the probability that a random sample causes a false
alarm at the upper limit, and let a be the average number of articles
tested before the false alarm is raised. We have then a=n/p or p=n/a.
If n and a are given, p can be determined; and if we assume x̄ to be
normally distributed, we can determine B such that the probability is
p that x̄ exceeds the upper control limit m+Bσ/√n, where m is the
mean and σ the standard deviation of the parent population. If, for
instance, we take n=5 and a=5000, we have p=0.001 and we find
B=3.09 from a set of normal tables.

Proceeding in this manner for various sample sizes, but keeping
a=5000 fixed, we obtain the values of p, B, B/√n shown in Table I.




TABLE I
VALUES OF p, B, B/√n, WHEN a=5000


Notation: p = probability that a random sample causes a false alarm; n = sample size; a = average
number of articles tested before an alarm occurs at the upper (or lower) control limit, if the process
remains under control.


n        3       4       5       6       8       10

p      .0006   .0008   .0010   .0012   .0016   .0020

B       3.24    3.16    3.09    3.03    2.95    2.88
B/√n    1.87    1.58    1.38    1.24    1.04    0.91
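
The construction of Table I can be retraced in a few lines. This is a minimal sketch, assuming that "a set of normal tables" may be replaced by the inverse normal distribution function in Python's standard library; the function name is illustrative. Small differences in the last printed digit are to be expected, since the paper's entries were read from tables.

```python
import math
from statistics import NormalDist

def mean_chart_factor(n, a):
    """Return (p, B, B/sqrt(n)) for sample size n and a articles per false alarm."""
    p = n / a                            # from a = n/p
    B = NormalDist().inv_cdf(1.0 - p)    # normal deviate with upper tail area p
    return p, B, B / math.sqrt(n)

table_one = {n: mean_chart_factor(n, 5000) for n in (3, 4, 5, 6, 8, 10)}
for n, (p, B, factor) in table_one.items():
    print(n, f"{p:.4f}", f"{B:.2f}", f"{factor:.2f}")
```

Because p grows with n, the deviate B shrinks as the sample size increases, and the limit factor B/√n shrinks faster than under the conventional fixed B.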


Similar arguments hold for the lower control limit, so that in each 
case an average of one false alarm above the upper and one false alarm 
below the lower control limit must be expected for every 5000 articles 
tested. In other words, we will be able to test an average of 2500 arti- 
cles before one false alarm is raised either way. 


Control limits for values of a other than 5000 may be calculated in
the same way. Table II gives values of B/√n for various values of a
and for sample sizes ranging from 3 to 50.


TABLE II
VALUES OF B/√n FOR CONTROL LIMITS m ± (B/√n)σ
FOR MEAN CHARTS


Notation: n = sample size; m = population mean; σ = population standard deviation; a = average
number of articles tested before an alarm occurs at the upper (or lower) control limit, if the process
remains under control.


 n     a=1000   a=2000   a=3000   a=4000   a=5000
 3      1.59     1.71     1.78     1.83     1.87
 4      1.33     1.44     1.50     1.54     1.58
 5      1.15     1.26     1.31     1.35     1.38
 6      1.02     1.12     1.17     1.21     1.24
 7      0.93     1.02     1.07     1.10     1.13
 8      0.85     0.94     0.99     1.02     1.04
 9      0.79     0.87     0.92     0.95     0.97
10      0.74     0.82     0.86     0.89     0.91
15      0.560    0.628    0.665    0.690    0.710
20      0.459    0.520    0.553    0.577    0.593
25      0.392    0.448    0.479    0.500    0.515
30      0.343    0.396    0.425    0.444    0.458
35      0.306    0.356    0.383    0.402    0.416
40      0.277    0.324    0.350    0.368    0.381
45      0.253    0.299    0.324    0.340    0.353
50      0.232    0.277    0.301    0.317    0.329


To determine the control limits m ± (B/√n)σ, we estimate the mean
m and standard deviation σ of the population and obtain the appropriate
factor B/√n from Table II. If, for instance, we want to use samples
of size n=8, and if we wish to test an average of 2000 articles
before raising a false alarm either way (a=4000), we find B/√n=1.02,
and the control limits are m ± 1.02σ.
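
The worked example above can be turned into actual limits once m and σ are estimated. In the sketch below the process values m=10.0 and σ=0.2 are hypothetical numbers chosen only for illustration, and the factor is computed from the normal distribution instead of being read from Table II.

```python
import math
from statistics import NormalDist

def mean_chart_limits(m, sigma, n, a):
    """Limits m ± (B/sqrt(n))*sigma giving one false alarm per a articles at each limit."""
    factor = NormalDist().inv_cdf(1.0 - n / a) / math.sqrt(n)
    return m - factor * sigma, m + factor * sigma, factor

# Hypothetical process estimates; n and a are those of the example in the text.
lower, upper, factor = mean_chart_limits(m=10.0, sigma=0.2, n=8, a=4000)
print(round(factor, 2), round(lower, 3), round(upper, 3))
```

Since a=4000 applies to each limit separately, the chart as a whole raises a false alarm about once in every 2000 articles, as stated in the text.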

3. THE AVERAGE AMOUNT OF INSPECTION WHEN THE
POPULATION MEAN CHANGES


If the population mean changes from m to m+kσ (k>0) while σ
remains constant, the variate x̄ will have the mean m+kσ and the
standard deviation σ/√n, so that the variate

$$z = \frac{\bar{x} - (m + k\sigma)}{\sigma/\sqrt{n}}$$

is a standardized normal variate (mean zero and S.D. one). The probability
that x̄ exceeds the upper control limit m+Bσ/√n is then

$$P = \Pr\left\{\bar{x} \ge m + \frac{B\sigma}{\sqrt{n}}\right\} = \Pr\left\{\bar{x} - m - k\sigma \ge \left(\frac{B}{\sqrt{n}} - k\right)\sigma\right\} = \Pr\left\{\frac{\bar{x} - m - k\sigma}{\sigma/\sqrt{n}} \ge B - k\sqrt{n}\right\} = \Pr\{z \ge B - k\sqrt{n}\}.$$

The average number of articles tested before an alarm is raised at the
upper control limit is then A(n)=n/P, where

$$P = \frac{1}{\sqrt{2\pi}} \int_{B-k\sqrt{n}}^{\infty} e^{-z^2/2}\,dz. \tag{3}$$

For any given n and B, the average number of articles tested (or
"average amount of inspection") A(n) is a function of k, whose values
can be calculated with the aid of a set of normal tables.

In Charts I and II, A(n) is plotted against k for various values of n,
while a=2000 and a=5000, respectively. Both charts show clearly the
superiority of large sample sizes over a wide range of k values. For
instance, n=10 is more powerful than n=5 for any value of k less
than 1.2 or 1.3, and n=20 is more powerful than n=5 for k less than
1.0 or 1.1. In particular, for a=5000 and k=0.7, the average amount of
inspection is 80 for n=5, 40 for n=10, and only 30 for n=20. The saving
of inspection is even greater when samples of 50 are used, but these
[...] mean will small samples be more powerful than large samples.
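
The average amount of inspection A(n)=n/P of Section 3 is easy to evaluate once B is fixed by n and a. The following sketch assumes the limits of Section 2 (p=n/a at the upper limit) and replaces the normal tables by Python's standard library; the function name is illustrative.

```python
import math
from statistics import NormalDist

def average_inspection(n, a, k):
    """A(n) = n/P for a shift of the mean by k standard deviations."""
    B = NormalDist().inv_cdf(1.0 - n / a)               # limit factor from Section 2
    P = 1.0 - NormalDist().cdf(B - k * math.sqrt(n))    # equation (3)
    return n / P

for n in (5, 10, 20):
    print(n, round(average_inspection(n, 5000, 0.7), 1))
```

For a=5000 and k=0.7 this gives values close to the 80, 40 and 30 articles quoted in the text, and for k=0 it returns a itself, which is a convenient consistency check on the limits.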


CHART I. Average Amount of Inspection
A(n) in Terms of k, for Mean Charts with
a=2000.

Notation: n = sample size; A(n) = average number of
articles tested before an alarm occurs at the upper control
limit, if the population mean has shifted from m to m+kσ;
a = average number of articles tested before an alarm occurs
at the upper (or lower) control limit, if the process remains
under control.

CHART II. Average Amount of Inspection
A(n) in Terms of k, for Mean Charts with
a=5000.

(Notation as in Chart I.)




In Chart III, A(n) is plotted against k for the fixed sample size
n=10 and various values of a. The curves show that we are able to reduce
the average amount of inspection by making a smaller, that is, by
increasing the average rate of false alarms. They show, however,
that small values of a are useful only for the detection of small changes
of the population mean. If, for instance, a=5000 is replaced by a=1000,
the average number of false alarms becomes 5 times as large, but the
number of real alarms when k=0.6 is only doubled.

With the help of Charts I and II, it will be easy to decide whether
large or small samples should be taken. If we are mainly concerned with
the detection of shifts of the mean larger than (say) one standard
deviation, small samples are advisable. If, on the other hand, small
values of k are expected, the samples should be large. If, for instance,
k is expected to lie between 0.4 and 0.8, both charts show that the
sample size n=20 is more powerful than n=10. If, on the other hand,
k is expected to lie between 0.8 and 1.2 (say), samples of 10 are better
than samples of 20. Once a and n are fixed, the control limits can be
determined by means of Table II.


4. CONTROL LIMITS FOR RANGE CHARTS 


If x₁, x₂, ···, xₙ is a random sample of n observations arranged in
order of magnitude, the variate R = xₙ − x₁ is called the range of the
sample. Instead of the range R, we shall consider the variate w = R/σ,
where σ is the standard deviation of the parent population.
No simple expression exists for the probability law φₙ(w) of w, but
tables have been prepared for the probability integral

$$p_n(W) = \int_0^W \varphi_n(w)\,dw \tag{4}$$

when the parent population is normal [1]. This expression represents
the probability that a random sample of size n has a range less than a
given multiple W of the population standard deviation σ. The probability
that the range exceeds the value Wσ is equal to 1−pₙ(W).

Like the control chart for means, most control charts for ranges have
an upper and a lower control limit. When a sample range falls outside
the control limits, it is regarded as an indication that the standard
deviation of the population has changed.

We shall determine the control limits W₁σ ≤ R ≤ W₂σ such that the
expected number of articles tested before a false alarm at the upper
limit W₂σ, or at the lower limit W₁σ, is equal to a preassigned number a
for any sample size n.


CHART III. Average Amount of Inspection
A(n) in Terms of k, for Mean Charts with
n=10.

(Notation as in Chart I.)


Let p₁ = pₙ(W₁) and p₂ = 1−pₙ(W₂) be the probabilities that the range
of a random sample falls below the lower and above the upper limit,
respectively. As in Section 2, we have then p₁=p₂=n/a. The values
of W₁ and W₂ can then be determined easily by means of the above
mentioned tables [1]. If, for instance, n=5 and a=5000, we have
p₁=p₂=0.001, and the tables provide W₁=0.37 and W₂=5.48. Similarly,
the values of W₁ and W₂ may be found for other values of n and a;
they are tabulated in Table III.


TABLE III
VALUES OF W₁ AND W₂ FOR CONTROL LIMITS W₁σ ≤ R ≤ W₂σ


Notation: n = sample size; R = sample range; σ = population standard deviation; W₁σ = lower
control limit; W₂σ = upper control limit; a = average number of articles tested before an alarm occurs at the
upper (or lower) control limit, if the process remains under control.


 n           a=1000   a=2000   a=3000   a=4000   a=5000
 3    W₂      4.64     4.00     5.06     5.18     5.25
      W₁      [...]    [...]    0.06     0.05     0.04
 4    W₂      [...]    [...]    5.23     5.31     5.40
      W₁      [...]    [...]    0.22     0.20     0.18
 5    W₂      [...]    [...]    5.30     5.41     5.48
      W₁      [...]    [...]    0.42     0.39     0.37
 6    W₂      [...]    [...]    5.38     5.48     5.55
      W₁      [...]    [...]    0.62     0.58     0.56
 7    W₂      [...]    [...]    5.44     5.54     5.68
      W₁      [...]    [...]    0.80     0.76     0.78
 8    W₂      [...]    [...]    5.47     5.58     5.67
      W₁      [...]    [...]    0.97     0.93     0.90
 9    W₂      [...]    [...]    5.53     5.64     [...]
      W₁      [...]    [...]    1.13     1.09     [...]
10    W₂      [...]    [...]    5.57     5.67     5.78
      W₁      [...]    [...]    1.27     1.22     1.18


5. THE AVERAGE AMOUNT OF INSPECTION WHEN THE POPULATION
STANDARD DEVIATION CHANGES


Suppose that the standard deviation of the parent population
changes from σ to σ'=kσ, k>1. The variate w'=R/σ' has then the
same distribution as previously the variate w=R/σ. It follows that
the probability that the range R of a random sample falls above the
upper limit W₂σ is

$$P = 1 - \Pr\{R < W_2\sigma\} = 1 - \Pr\left\{w' < \frac{W_2\sigma}{\sigma'}\right\} = 1 - \Pr\left\{w' < \frac{W_2}{k}\right\} = 1 - \int_0^W \varphi_n(w)\,dw, \tag{5}$$

where W = W₂/k.

The average number of articles tested before a change of the standard
deviation from σ to kσ is detected is, as in Section 3, A(n)=n/P.

A(n) depends on n and k and also on the number a defined in Section
4. Its relation to k for any given values of a and n can be easily obtained
by means of the range tables [1].
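
Where the range tables [1] are not at hand, the probability integral of the range can be approximated numerically. The sketch below is not the method of the paper: it uses the standard identity pₙ(W) = n ∫ φ(x)[Φ(x+W) − Φ(x)]ⁿ⁻¹ dx for a normal parent, integrated by Simpson's rule, and bisection to invert it. All function names, the integration interval and the step count are assumptions made for illustration.

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def range_cdf(W, n, lo=-8.0, hi=8.0, steps=1600):
    """p_n(W): probability that the range of n normal observations is below W*sigma."""
    h = (hi - lo) / steps                      # steps must be even for Simpson's rule
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * phi(x) * (Phi(x + W) - Phi(x)) ** (n - 1)
    return n * total * h / 3.0

def solve_W(target, n, lo=0.0, hi=12.0):
    """Find W with range_cdf(W, n) = target by bisection (range_cdf increases with W)."""
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if range_cdf(mid, n) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def range_limits(n, a):
    """W1 and W2 with p1 = p2 = n/a, as in Section 4."""
    p = n / a
    return solve_W(p, n), solve_W(1.0 - p, n)

def average_inspection_range(n, W2, k):
    """A(n) = n/P with P from equation (5), W = W2/k."""
    return n / (1.0 - range_cdf(W2 / k, n))

W1, W2 = range_limits(5, 5000)
print(round(W1, 2), round(W2, 2))
for k in (1.0, 1.5, 2.0):
    print(k, round(average_inspection_range(5, W2, k)))
```

For n=5 and a=5000 the limits come out close to the tabulated W₁=0.37 and W₂=5.48, and at k=1 the function returns a, as it should when the standard deviation has not changed.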

In Charts IV and V, A(n) is plotted against k for some values of n
and a. Chart IV shows that little is gained by increasing the sample size
from n=5 to n=10. Chart V shows that the amount of inspection may
be reduced by making a smaller, but, as for mean charts, small values
of a are useful only for the detection of small changes in the standard
deviation.


6. CONTROL LIMITS FOR MEAN CHARTS USING RUNS


It was shown in [3] that the power of mean charts for small samples
can be improved by the use of runs. For such charts, control limits
m ± Bσ/√n are determined, and as soon as λ successive x̄ values fall
above the upper or below the lower control limit, alarm is raised and
production is stopped to allow investigation. We shall again determine
B such that the average number of articles tested between two successive
false alarms is independent of the sample size.

If P is the probability that a random x̄ value falls above the upper
control limit, the average number of samples that will pass before λ
successive x̄ values fall above the upper limit is [3]
S = P^(-1) + P^(-2) + ··· + P^(-λ). The values of P for any B, k, and n
are again given by equation (3). The average number of articles tested
before an alarm is raised at the upper control limit is then A(n)=nS.

In particular, a false alarm is raised when k=0. The probability P
then becomes

$$p = \frac{1}{\sqrt{2\pi}} \int_{B}^{\infty} e^{-z^2/2}\,dz, \tag{6}$$


CHART IV. Average Amount of Inspection
A(n) in Terms of k, for Range Charts with
a=5000.

Notation: A(n) = average number of articles tested
before the alarm occurs at the upper control limit, if the
population standard deviation has changed from σ to kσ;
n and a are defined as in Chart I.

CHART V. Average Amount of Inspection
A(n) in Terms of k, for Range Charts with
n=10.

(Notation as in Chart IV.)


and the average number of articles tested before a false alarm is raised
at the upper control limit is a=ns, where s = p^(-1) + p^(-2) + ··· + p^(-λ). If,
for instance, a=5000, λ=2, and n=4, we find p=0.0287. A set of normal
tables supplies B=1.90, and B/√n=0.95 follows. In this way we
can determine control limit factors B/√n for various values of λ, a,
and n, as shown in Table IV.
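
The step from a, λ and n to p requires solving a=ns for p, which has no closed form for general λ. The sketch below is one way to do it, by bisection, under the assumption that a/n exceeds λ; the paper does not say how its values were obtained, and the function names are illustrative.

```python
import math
from statistics import NormalDist

def expected_samples(P, lam):
    """S = P^-1 + P^-2 + ... + P^-lam."""
    return sum(P ** -j for j in range(1, lam + 1))

def run_chart_factor(n, a, lam):
    """Return (p, B, B/sqrt(n)) such that n * s(p) = a, solved by bisection."""
    target = a / n
    lo, hi = 1e-9, 1.0            # s(p) falls from very large values to lam on (0, 1]
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if expected_samples(mid, lam) > target:
            lo = mid              # p too small: too many samples between alarms
        else:
            hi = mid
    p = 0.5 * (lo + hi)
    B = NormalDist().inv_cdf(1.0 - p)
    return p, B, B / math.sqrt(n)

p, B, factor = run_chart_factor(n=4, a=5000, lam=2)
print(round(p, 4), round(B, 2), round(factor, 2))
```

For a=5000, λ=2 and n=4 this reproduces the worked example in the text. With λ=1 the function reduces to the simple chart of Section 2, since s is then just 1/p.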


TABLE IV
VALUES OF B/√n FOR CONTROL LIMITS m ± (B/√n)σ
FOR RUN CHARTS

Notation: n = sample size; m = population mean; σ = population standard deviation; a = average
number of articles tested before a run of λ sample means occurs above the upper (or below the lower)
control limit, if the process remains under control.


 λ     n     a=1000   a=2000   a=3000   a=4000   a=5000
 2     1      1.85     2.00     2.09     2.15     2.20
       4      0.76     0.85     0.90     0.93     0.95
       5      0.65     0.73     0.78     0.81     0.83
       8      0.47     0.53     0.57     0.60     0.62
      10      0.40     0.46     0.49     0.52     0.53
      20      0.23     0.28     0.31     0.33     0.34

 3     1      1.26     1.39     1.47     1.52     1.56
       4      0.49     0.56     0.59     0.63     0.65
       5      0.41     0.48     0.52     0.55     0.56
       8      0.28     0.34     0.37     0.40     0.41
      10      0.23     0.29     0.32     0.34     0.35

 4     1      0.89     1.01     1.08     1.13     1.16
       4      0.31     0.38     0.42     0.45     0.47
       5      0.25     0.32     0.35     0.38     0.40
       8      0.16     0.22     0.25     0.27     0.28
      10      0.12     0.18     0.21     0.22     0.24


7. THE AMOUNT OF INSPECTION FOR RUN CHARTS 
CONTROLLING THE MEAN 


In this section, we shall discuss power curves similar to those given
in [3]. The curves shown in this paper differ from those in [3] only in the
values of B, which here are functions of n such that the average number
a of articles tested between two successive false alarms becomes
independent of the sample size.

To find the value of A(n)=nS for any given n, λ, a, and k, we determine
first the value of B/√n from Table IV. We then find P by
equation (3), using a set of normal tables, and deduce
S = P^(-1) + P^(-2) + ··· + P^(-λ). (Actually, to plot the curves, we found
it more convenient to start with given values of P and to deduce
corresponding values of k, S, and A(n).)

Chart VI shows that for λ=2 it is still advantageous to take a large
sample size, such as n=10. On the other hand, n=20 represents only a
slight improvement on n=10, and this only over a restricted range for
k (k<0.6). When λ=3 is used (Chart VII), the advantage of larger
samples becomes even less pronounced. However, Chart VIII shows
that there is still a considerable saving in inspection when a simple
chart (λ=1) for large samples can be used instead of a run chart for
small samples.

CHART VI. Average Amount of Inspection
A(n) in Terms of k, for Run Charts with λ=2,
a=4000.

Notation: n = sample size; A(n) = average number of
articles tested before a run of λ sample means occurs above
the upper control limit, if the population mean has shifted
from m to m+kσ; a = average number of articles tested
before a run of λ sample means occurs above the upper (or
below the lower) control limit, if the process remains under
control.

CHART VII. Average Amount of Inspection
A(n) in Terms of k, for Run Charts with λ=3,
a=4000.

(Notation as in Chart VI.)

CHART VIII. Average Amount of Inspection
A(n) in Terms of k, for λ=1, n=20 and λ=3,
n=4.

(Notation as in Chart VI.)
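
The procedure of Section 7 (factor from Table IV, P from equation (3), then S and A(n)=nS) can be sketched as follows. The factor 0.95 is the Table IV entry for λ=2, n=4, a=5000; the function name and the chosen values of k are illustrative, and the normal tables are replaced by Python's standard library.

```python
import math
from statistics import NormalDist

def run_chart_inspection(n, lam, factor, k):
    """A(n) = n*S for a run chart with limits m ± factor*sigma after a shift of k*sigma."""
    B = factor * math.sqrt(n)                           # Table IV lists B/sqrt(n)
    P = 1.0 - NormalDist().cdf(B - k * math.sqrt(n))    # equation (3)
    S = sum(P ** -j for j in range(1, lam + 1))         # samples until lam successive exceedances
    return n * S

# Table IV entry for lam=2, n=4, a=5000.
inspection = {k: run_chart_inspection(4, 2, 0.95, k) for k in (0.0, 0.5, 1.0)}
for k, value in inspection.items():
    print(k, round(value))
```

At k=0 the result is close to a=5000, the small difference coming from the two-decimal rounding of the tabulated factor, and the amount of inspection falls quickly as k grows.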


8. THE USE OF RUNS FOR RANGE CHARTS 


Instead of stopping the production when a single value of R falls
outside the control limits R₁, R₂ of an ordinary range chart, we may
calculate a pair of narrower control limits R₁', R₂', and stop production as
soon as λ successive R values fall above the upper or below the lower
control limits. The new limits are obtained as in Sections 4 and 6
(e.g. for n=5, λ=2, we obtain R₂'=4.08σ).

Power curves can then be plotted by the methods described in Sections
5 and 7, but results obtained show that the use of runs reduces
rather than improves the power of the chart.

9. REFERENCES
[1] Pearson, E. S., and Hartley, H. O., “The probability integral of the range in
samples of n observations from a normal population,” Biometrika, 32 (1942).
[2] Weiler, H., “On the most economical sample size for controlling the mean of
a population,” Annals of Mathematical Statistics, 23 (1952), 247-54.
[3] Weiler, H., “The use of runs to control the mean in quality control,” Journal
of the American Statistical Association, 48 (1953), 816-25.


PROCEEDINGS 


AMERICAN STATISTICAL ASSOCIATION
113TH ANNUAL MEETING


SHOREHAM HOTEL, WASHINGTON, D. C.
DECEMBER 28, 1953


MINUTES OF THE ANNUAL BUSINESS MEETING
The meeting was called to order by William G. Cochran, outgoing President
of the Association.
Report of the Committee on Elections 


A Report of the Committee on Elections shows the following officers elected
for 1954:

President Elect                        Ralph J. Watkins
Vice President (1954-56)               Henry Scheffé
Directors (1954-56)                    Jacob Marschak
                                       Donald C. Riley
Representative at Large (1954-55)      Daniel B. deLury
District Representatives
  Northeastern District                Chester I. Bliss
  Eastern District                     George Garvy
  Southeastern District                Frank A. Hanna
  North Central District               Paul R. [illegible]
  South Central District               William Keste
  Western District                     John C. McKee

Report of the Board of Directors for 1953

The Report of the Board of Directors was read and accepted. The Report is
published separately following the Minutes of this Meeting.

Report of the Secretary-Treasurer for 1953

Samuel Weiss read the Report of the Secretary-Treasurer for the year 1953.
The Report was accepted. The Secretary-Treasurer's Report is published
separately following the Minutes of this Meeting.


Report of the Committee on Resolutions 


The Committee on Resolutions presented the following to the membership
for their consideration:

1. Resolution regarding the Program Committee
RESOLVED that the members and officers of the American Statistical
Association express deep appreciation for the excellent program prepared
by members of the Program Committee under the leadership of Herbert
Solomon, Chairman.

2. Resolution regarding the Local Arrangements Committee and the Washington,
D. C. Chapter
RESOLVED that the members and officers of the American Statistical
Association express their profound appreciation to the Local Arrangements
Committee under the Chairmanship of Donald C. Riley and to all of the
individuals of the Washington, D. C. Chapter for their outstanding work
and hospitality in connection with the arrangements for the 113th Annual
Meeting of the Association.

The resolutions were approved. 


Report on the Schedule of Forthcoming Meetings 
The forthcoming meetings of the Association are scheduled as follows: 


1954—Annual Meeting—Montreal, Canada—September 10-13, 1954 

1954-Regional Meeting-San Francisco Regional Conference-Berkeley,
California-December 1954

1955—Annual Meeting—New York City—December 27-29, 1955 


There being no new business, the meeting was adjourned. 


REPORT OF THE BOARD OF DIRECTORS, 1953 


Activities of the Association during 1953 were vigorous and widespread in
scope. This was coupled with a most successful financial year. Details of finance
and membership will be found in the Secretary-Treasurer’s Report.


1. Sections and Committees 


The Board has granted sectional status to the Social Statistics Section, for- 
merly the Committee on Statistics in the Social Sciences. This Committee re- 
quested sectional status at the Meeting of the Incoming Board and Council in 
December, 1952. The matte: was referred to the Committee on Committees, 

"which, alter révicyte@ the final charter of the proposed section, recommended 
to the Board that appi;yal be given. The approval of this Charter makes this 
the fourth Section of the ‘Association; the other three being the Biometric Sec- 
tion, the Business and Economies Statistics Section and the Section on the 
"Training in Statistics. 

"These four Sections and the Committee on Statistics in the Physical Sciences 
have been extremely active in the formulation and planning of the Annual Meet- 
ing programs, as well as the planning for a number of successful Regional Meet- 
ings. Without any question, the formation of the Sections has been extremely k 
beneficial to the Association. The variety of interests of the membership of the 
Association is best served by the activity of these Sections. 

Two ad hoc committees were constituted by the Board this year. They are the 
Ad Hoc Committee on Publications Policy and the Ad Hoc Committee on Sta- 
tistical Standards and Organization. Both groups were asked to make their 
Reports to the Board so that action may be taken on their recommendations in 
these two important areas.

The Board wishes to commend the 1953 Program and Local Arrangement 
Committees. The former, with Herbert Solomon as Chairman, has done an
outstanding job of presenting a well-balanced, varied group of sessions for the
Annual Meeting. The Local Arrangement Committee, chaired by Donald Riley,
has worked very hard to ensure a successful meeting. Their efforts for ASA, in
cooperation with the other societies meeting jointly, have produced a memorable
convention.


PROCEEDINGS OF THE 113TH ANNUAL MEETING 317 


2. Abstracts 


This year for the first time the Association has published abstracts of papers 
presented at the previous Annual Meeting in the Journal of the American Statistical
Association. The editor of these abstracts was Armen Alchian. The
Board is very pleased to announce that Professor Alchian has agreed to continue 
in this position for the coming year. It is hoped that, with cooperation from per- 
sons presenting papers at this Annual Meeting, the abstracts will appear in an 
early issue of the 1954 volume of the Journal. 


3. New Constitution


The new Constitution, approved by the membership in a mail ballot in 1953, 
will go into effect January 1, 1954. The changes incorporated into the new version 
are, for the most part, procedural rather than substantive, and should facilitate 
the operation of the Association. The draft version, which was ratified by the 
membership, was printed in the June-July, 1952 issue of the American Statis- 
tician. 


4. New Chapters


During 1953 four new Chapters were granted Charters by the Board of Di- 
rectors. The new groups are located in Tulsa, New Orleans, Puerto Rico and 
Milwaukee. The Board extends a welcome to these new Chapters. 

The total number of active Chapters has now reached 31. Chapter meetings, 
often in cooperation with local chapters of other societies, offer a wide variety 
of topics to the membership in the fields of statistical interest. 


5. Conferences and Meetings 

During 1953 the Business and Economic Statistics Section sponsored two
regional conferences. The first was held on April 30 and May 1 in cooperation
with the Graduate School of Industrial Administration, the Carnegie Institute
of Technology and the Pittsburgh Chapter of the American Marketing Association
and the American Society for Quality Control. The Conference was devoted
to modern statistical methods in business and industry.

The second Conference, sponsored jointly with the Wharton School of the 
University of Pennsylvania, within its theme of business statistics was divided 
into three main sessions—on capital outlays, production scheduling and sales 
forecasting. This Conference took place in Philadelphia on June 11 and 12. Both 
Conferences were well attended and very successful. 


6. New Monograph 


Following the Association’s policy of a more vigorous program of publication, 
the Board has voted to publish as a monograph the complete Report of the ASA 
Committee to Advise the National Research Council Committee for Research 
in Problems of Sex. This Report, dealing with the volume on the human male
entitled "Sexual Behavior in the Human Male," by Professor Kinsey and his
Associates, will be available in 1954. Certain sections of the Report are also being
published as articles in the Journal of the American Statistical Association.


7. New Appointments of ASA Representatives 
Samuel S. Wilks was elected by the Council to continue to serve as the ASA
Representative to the Social Science Research Council for a three-year term.



318 AMERICAN STATISTICAL ASSOCIATION JOURNAL, JUNE 1954 


The National Bureau of Economic Research announced the resignation, due
to illness, of Frederick C. Mills as ASA Representative to its Board of Directors. 
The Board has appointed W. Allen Wallis to complete the unexpired portion of 
Dr. Mills’ term. A resolution was received from the National Bureau of Economic
Research which expressed deep regret for the necessity of Dr. Mills’
resignation and thanked the ASA for his valuable services during his tenure of
office.


8. Journal of the American Statistical Association 


The Board approved an increase in the budget for the Journal for 1953, with 
the result that more articles were published in the 1953 volume, and the number 
of pages increased by approximately 25 per cent over 1952. In addition, the
Board approved an increase in the funds for editorial assistance, which heretofore 
has been supplied by the University of Chicago. 


9. Reduction in Dues for Foreign Members 


The Board voted to reduce the dues for persons residing outside North Ameri- 
ca, beginning in 1954. The reduction from $8.00 to $5.00 yearly will make it 
easier for many foreign members in terms of dollar exchange. It is expected that 
the slight loss in income from this reduction will be more than offset by the in- 
crease in new members outside North America. A 1954 drive for new foreign 
members is planned. 


10. Future Meetings 


Future plans of the Association call for the 1954 Annual Meeting to be held 
in Montreal, Canada, on September 10-13. Two Regional Meetings are also 
being planned for 1954. The Chicago Chapter, in cooperation with other Chapters
in its District, has begun work on a meeting to be held in the spring, probably
during [illegible]. A Regional Meeting has been scheduled for San
Francisco in December of 1954, in conjunction with the American Association 
for the Advancement of Science. Final dates and headquarters for this conference 
have not yet been chosen. Advance publicity will be issued for both of these 
Regional Meetings. 

The Montreal Meeting was planned for September to leave Christmas week 
free for members who usually attend Annual Meetings, and to attract those who
do not attend late December Meetings. In this sense it is an experiment designed 
to guide the Association in the selection of other meeting dates. The widest under- 
standing of the nature of the experiment will help its success. 


REPORT OF THE SECRETARY-TREASURER, 1953 


A drive for an increase in new members which started in 1952 continued actively
in 1953. This, together with a policy of careful economy, has resulted in
increasing the Association’s surplus by a wider margin than ever before.

Income for 1953 was budgeted at $52,650. The actual total 1953 income is
$60,377.31. This difference is due to a rise in income of most budgeted items,
but primarily results from increases in receipts from membership dues, sales of
publications and subscriptions. Expenses were budgeted at $50,612. The actual
total 1953 figure is $50,433.83. All expenses have been kept close to budget
level, and small savings have been made on a number of different items. Thus,
the Association shows an increase to surplus of $9,943.48 at the end of 1953. This
brings the total surplus to over $26,000, about halfway to the goal of a surplus
equal to one year's income.
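
The surplus figures quoted above can be checked with a few lines of arithmetic. The sketch below is an illustration only and is not part of the original report; the variable names are hypothetical, and the opening surplus of $16,965.34 is taken from the Statement of Income and Surplus Accounts.

```python
from decimal import Decimal

# Figures as stated in the Secretary-Treasurer's Report and the
# Statement of Income and Surplus Accounts (variable names are illustrative).
income_1953 = Decimal("60377.31")
expense_1953 = Decimal("50433.83")
surplus_start = Decimal("16965.34")

# Increase to surplus is income less expense for the year.
increase = income_1953 - expense_1953

# Closing surplus is the opening surplus plus the year's increase.
surplus_end = surplus_start + increase

# Progress toward the stated goal of a surplus equal to one year's income.
share_of_goal = surplus_end / income_1953

print(increase)                  # 9943.48
print(surplus_end)               # 26908.82
print(round(share_of_goal, 2))   # 0.45
```

On these figures the closing surplus is roughly 45 per cent of one year's income, which agrees with the report's description of being about halfway to the goal.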

For the second successive year the number of members has shown a significant 
increase. At the beginning of 1953 the membership totaled 4,655. The number of 
new members for 1953 was 639, and 25 others reinstated their membership. At 
the end of 1953 approximately 400 members have been dropped from the rolls 
because of resignation, death or non-payment of dues. Thus, the net membership 
growth for 1953 is 264, and the Association starts 1954 with a total of more than 
4,900 members. This is the highest level of membership ever reached by the 
Association. It is expected that the drive for more foreign members, as mentioned 
in the Board of Directors' Report, will add substantially to the membership in 
1954. 
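
The membership figures in the preceding paragraph can be reconciled in the same way. This is a minimal sketch with hypothetical variable names; it assumes the stated net growth of 264 is exact and uses it to recover the number of members dropped, which the report gives only as approximately 400.

```python
# Membership figures as stated in the report (variable names are illustrative).
members_start = 4655   # membership at the beginning of 1953
new_members = 639      # new members during 1953
reinstated = 25        # reinstated memberships
net_growth = 264       # net growth stated for 1953

# Members dropped, implied by additions less net growth.
dropped = new_members + reinstated - net_growth

# Membership at the start of 1954.
members_end = members_start + net_growth

print(dropped)      # 400
print(members_end)  # 4919
```

The implied closing membership of 4,919 is consistent with the report's "more than 4,900 members".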

Subscriptions to the Journal of the American Statistical Association have also 
been increasing. At the end of 1952 there were 1,248 subscribers to the Journal,
while at the end of 1953 the figure had risen to 1,356. This increase is expected to 
continue in 1954.


Financial Recommendations


The Report of the Treasurer, shown separately, emphasizes that 1953 was the 
fourth year in succession in which the Association has accrued surplus. The Board 
of Directors for the past two years has recommended that the surplus be in- 
creased until it equals the income of the Association for one year. At that time 
it is felt that the Association will be in a much stronger position to expand its 
activities on a much wider scale. With this in mind, the Treasurer has planned to 
budget between $2,000 and $3,000 per year for addition to surplus until this 
goal is reached. The proposed income for 1954 is budgeted at approximately
$55,000, while expense has been calculated at $52,500, leaving $2,500 for addition
to surplus. Income has been figured very conservatively, while expense has
been approximated as closely as possible, with the expectation that the surplus
may be somewhat larger than budgeted. 


April 19, 1954 
To the Board of Directors of 
American Statistical Association. 


I have examined the attached financial statements of American Statistical 
Association relating to the year ended December 31, 1953. My examination was 
made in accordance with generally accepted auditing standards and, accordingly, 
included such tests of the accounting records and such other auditing procedures 
as were considered necessary in the circumstances. 

The recorded cash receipts for the year were traced in the deposits shown on
the bank statements and the amounts for dues and subscriptions were tested
with the membership and subscription records. The paid checks were inspected
and related vouchers tested in support of cash disbursements for the year. The
bank balances were reconciled with amounts reported directly to me by the
depositaries and the cash on hand at December 31, 1953 was verified by inspection.
I did not check the membership and subscription records in detail or make
any independent verification of the inventory of old Journals, the office records
of which are based, in part, on data assembled in prior years.

In accordance with a resolution of the Board of Directors, the expense incurred
in publishing a directory, distributed to the membership in 1951, is being allocated
over a three-year period although such costs would appear to be applicable primarily
to the year 1951. The accounts for the year ended December 31, 1953
reflect a charge of $831.86, representing the allocated portion of the directory
expense applicable to that period.

In my opinion, the accompanying statements present fairly the position of
American Statistical Association at December 31, 1953, and the results of its
operations for the year, in conformity with generally accepted accounting principles
applied on a basis consistent, except as mentioned in the preceding paragraph,
with that of the preceding year.

James G. Jus 


AMERICAN STATISTICAL ASSOCIATION
BALANCE SHEET
December 31, 1953

Assets
Cash in banks and on hand ...............................  $52,431.05
Accounts receivable .....................................    2,783.14
Investment in United States Savings Bonds,
  Series G, due 1962, at cost ...........................    3,100.00
Inventory of old Journals, at approximate cost ..........    2,137.69
Inventory of Monograph on Acceptance Sampling, at cost ..      129.24
Inventory of Emblems, at cost ...........................      415.50
Furniture and fixtures, at cost less depreciation .......    2,088.78
Deferred Charges:
  Deferred Membership Directory expense .................  [no amount legible]
  [label illegible] .....................................      945.55
                                                           ----------
                                                           $64,030.95

Liabilities and Net Worth
Accounts payable ........................................  $10,430.60
Deferred income (collections applicable to subsequent years):
  Dues ..................................  $16,827.00
  Subscriptions .........................    5,817.77
  [label illegible] .....................      466.84
                                           ----------
                                                            23,111.61
Net Worth:
  Life Membership reserve ...............  $ 3,579.92
  Surplus, per statement ................   26,908.82
                                           ----------
                                                            30,488.74
                                                           ----------
                                                           $64,030.95


AMERICAN STATISTICAL ASSOCIATION
STATEMENT OF INCOME AND SURPLUS ACCOUNTS
Year ended December 31,

                                                        1953          1952
Income:
  Dues:
    Current year ................................  $38,607.00    $37,101.00
    Prior year ..................................      852.00        184.00
  Life membership income ........................      (90.13)       166.31
  Subscriptions:
    Journal .....................................   10,134.80      9,543.00
    American Statistician .......................      443.08        428.35
  Advertising:
    Journal .....................................    1,415.99      1,376.75
    American Statistician .......................      263.97        217.72
  Sales:
    Journal .....................................    1,937.13      1,180.84
    American Statistician .......................      148.02         82.65
    Acceptance Sampling .........................      302.05        243.00
    Emblems, less cost of sales .................       83.28         11.00
    Membership Directory ........................       22.50         45.00
    Biometrics ..................................      508.37        453.25
    Other .......................................       45.00         35.75
  Mailing list income ...........................      911.11        704.17
  Interest income ...............................      911.84        541.37
  Annual meeting (see note) .....................    1,559.52
  Reimbursement of overhead expenses:
    Bureau of Mines Project .....................    2,207.06      2,000.00
  Miscellaneous .................................       54.12         67.33
                                                   ----------    ----------
                                                   $60,377.31    $54,381.49

Expense:
  Salaries ......................................  $12,260.07    $14,716.81
  Publications (Schedule I) .....................   24,968.23     20,417.77
  Promotion .....................................      861.89        702.30
  Rent ..........................................    2,400.00      2,400.00
  Travel and secretarial expense ................      800.83      1,426.17
  Supplies ......................................    2,240.92      2,484.54
  [label illegible] .............................    1,849.26      1,168.45
  Telephone and telegraph .......................      729.02        563.71
  Accounting services ...........................      970.00        970.00
  Committee expense .............................    1,269.75        575.50
  Annual meeting expense ........................      645.99        550.72
  Miscellaneous expenses (Schedule I) ...........    1,437.87      1,928.83
                                                   ----------    ----------
                                                   $50,433.83    $47,849.80

Excess of income over expense for the year ......  $ 9,943.48    $ 6,531.69
Add: Surplus account, at beginning of year ......   16,965.34     10,433.65
                                                   ----------    ----------
Surplus account, at end of year .................  $26,908.82    $16,965.34

Note: Includes $303.29 relating to receipts from 1952 meeting.


Schedule I
AMERICAN STATISTICAL ASSOCIATION
Year ended December 31,

                                                        1953          1952
Publications:
  Journal:
    Printing ....................................  $15,118.77    $12,522.59
    Abstracts ...................................      750.00
    Editorial expense ...........................    1,215.30        335.98
    Cost of old Journals ........................      188.67        101.70
    Delivery charges ............................       36.11         32.14
    Storage charges .............................      114.00         96.00
                                                   ----------    ----------
                                                   $17,422.85    $13,088.41
  American Statistician .........................    6,584.28      5,810.75
  Acceptance Sampling ...........................      129.24        218.61
  Membership Directory ..........................      831.86      1,300.00
                                                   ----------    ----------
                                                   $24,968.23    $20,417.77

Miscellaneous Expense (item labels are illegible; amounts are given in the order printed):
  1953:  $617.64   123.25   4.50   314.89   153.25                     Total $1,437.87
  1952:  $555.87   123.25   7.84   559.41   206.95   26.91   448.00    Total $1,928.83




SUMMARIES OF PAPERS DELIVERED AT THE 113TH
ANNUAL MEETING OF THE AMERICAN STATISTICAL
ASSOCIATION IN WASHINGTON, D. C.,
DECEMBER 27 TO 30, 1953


Edited by ARMEN A. ALCHIAN, University of California (Los Angeles)


The present section contains all available abstracts of papers presented at
the 1953 national meeting of the American Statistical Association in Washington,
D. C. The sequence of presentation here conforms to a grouping of abstracts
according to the various sessions at which they were delivered.


PAPERS SUMMARIZED 


ALLISON, HARRY, BRINSER, AYERS, AND ZWICK, CHARLES, An Analysis of the Demand for Meat
ARROW, KENNETH J., Development in Statistical Techniques of the Last Decade of
    Special Interest to Economists: Sequential Analysis
BACHRACH, CLIFFORD A., Estimation of Length of Hospital Stay from Discharge Data
BANCROFT, T. A., Organization and Functions of a Complete Centralized Statistical
    Center at a Land Grant College
BANCROFT, T. A., Preliminary Tests and Pool Rules
BARANKIN, EDWARD W., Theory of Behavior
BATES, GRACE E., The Time-interval Approach to the Problem of Contagion
BEALL, GEOFFREY, Further Generalization of Neyman's Distributions
BECHHOFER, ROBERT, DUNNETT, CHARLES W., AND SOBEL, MILTON, A Single-Sample, a
    Two-Sample and a Sequential Multiple Decision Procedure for Ranking Means of
    Normal Populations with Known Variances
BELLOC, NEDRA B., Validation of Morbidity Survey Data by Comparison with Medical Records
BERGER, MARTIN J., Statistical Problems in Physics
BERKSON, JOSEPH, Estimation of the Interval Rate in Actuarial Calculations:
    A Critique of the Person-Years Concept
BIRNBAUM, Z. W., Characterization of Distribution-free Statistics
BOX, G. E. P., AND HUNTER, J. S., The Study and Exploitation of Response Regions
BRETEY, PIERRE, The Future of Railroad Shares
BRINSER, AYERS, ALLISON, HARRY, AND ZWICK, CHARLES, An Analysis of the Demand for Meat
BROIDA, ARTHUR L., GEHMAN, CLAYTON, TRUEBLOOD, LORMAN, SCHWARTZ, M. H., MOSS,
    MILTON, AND CODY, PETER M., Industrial Production Index
BROWN, ERNEST W., AND WOODBURY, MAX A., Time Series Factor Analysis with an
    Economic Application
BRY, GERHARD, Trends and Cycles in German Wages
BURGESS, ROBERT W., The Relation of Census Tracts to the General Census Program
BURR, IRVING W., Use of Experiments in Engineering Statistics
CHERNOFF, HERMAN, AND LIEBERMAN, GERALD J., A Note on the Use of Normal
    Probability Paper
CODY, PETER M., GEHMAN, CLAYTON, TRUEBLOOD, LORMAN, BROIDA, ARTHUR L., SCHWARTZ,
    M. H., AND MOSS, MILTON, Industrial Production Index
COHEN, BERNARD M., AND COOPER, MAURICE Z., A Follow-up Study of Mortality in
    World War II Prisoners of War
COLM, GERHARD, The Economic Outlook for 1954
CONNOR, W. S., New Experimental Designs for Paired Observations


COOPER, MAURICE Z., AND COHEN, BERNARD M., A Follow-up Study of Mortality in
    World War II Prisoners of War
COPELAND, MORRIS A., Current Problems in Measuring Moneyflows
CORNFIELD, JEROME, Some Finite Sampling Concepts in Experimental Statistics
CROCKETT, JEAN BRONFENBRENNER, Effect of Current Operating Experience on the
    Realization of Investment Plans
CROW, EDWIN L., Statistical Determination of Tolerances in Rocket Development
CUTTRELL, FLORENCE S., Uses of Small Area Census Data in New York City
CYERT, R. M., The Use of Statistical Techniques in the Aging of Accounts Receivable
DAILEY, JOHN T., Achieving Maximum Prediction per Unit of Testing Time
DALY, JOSEPH F., Survey of the Theory of Finite Sampling
DARROCH, J. G., Role of a Centralized Statistical Organization in a University
    [remainder illegible]
DAUER, ERNST A., Limitations of Consumer Credit Statistics
DAVIS, D. J., Parametric Estimation of Survivorship
DAY, B. B., The Statistician in a Research and Development Laboratory
DEARDORFF, NEVA R., Longitudinal Studies of HIP Experience, Background and some
    Findings of Pilot Study
[name illegible], AND FRAZIER, DAVID, Gasoline Mileage in Winter Day-to-[remainder illegible]
DEL PRIORE, F. R., AND KOMMERS, WILLIAM J., An Example of a Fractional Replication
    in a Bearing Abrasive Wear Test
DOUTY, H. M., Union Impact on Wage Structures
DUGGAR, GEORGE, Use of Census Tracts in Study of Changing Residential Patterns in
    Metropolitan Areas
DUNNETT, CHARLES W., BECHHOFER, ROBERT, AND SOBEL, MILTON, A Single-Sample, a
    Two-Sample and a Sequential Multiple Decision Procedure for Ranking Means of
    Normal Populations with Known Variances
ECKLER, A. ROSS, AND TAEUBER, CONRAD, The Current Statistics Program of the
    Census Bureau
EISNER, ROBERT, Expectations, Plans and Capital Expenditures: A Synthesis of
    Ex Post and Ex Ante Data
ELY, J. EDWARD, Adequacy of International Trade Statistics for Economics and
    Business Analysis
ERHARDT, CARL L., Hospital Morbidity Reporting: Experiences and Findings of a
    Pilot Project
FOOTE, RICHARD J., AND KUZNETS, GEORGE M., The Demand for Citrus Products
[name illegible], Role of a Centralized Statistical Organization in a College with
    Respect to the Reviewing of Manuscripts and the Promotion of [remainder illegible]
GAFFEY, WILLIAM R., The Problem of Within Family Contagion
GEHMAN, CLAYTON, TRUEBLOOD, LORMAN, BROIDA, ARTHUR L., SCHWARTZ, M. H., MOSS,
    MILTON, AND CODY, PETER M., Industrial Production Index
[name illegible], [illegible] Index of Highway Ton Miles
[name illegible] N., Simultaneous Test of Linear [illegible] of Variance [remainder illegible]
GINZBERG, ELI, Scientific and Professional Manpower
GIVENS, MEREDITH B., Tracts in Analysis of Worker Mobility
GOLDFARB, NATHAN, Longitudinal Study of Health Insurance Plan of [remainder illegible]
GOLDSTEIN, HAROLD, Recent Advances in Statistics on Scientific and Professional
    Personnel




GREENBERG, LEON, AND SEARLE, ALLAN, The New Bureau of Labor Statistics Indexes of
    Productivity in Manufacturing
GRIEVES, HOWARD C., “Spot Checks” in Lieu of Complete Censuses
GRIFFITHS, J. C., Analysis of Variance Models in Sedimentary Petrology
GULBRANDSEN, R. A., AND McKELVEY, V. E., Problems in Sampling the Phosphoria Formation
HANSEN, MORRIS H., AND HURWITZ, WILLIAM N., Developments in Collection and
    Processing of Mass Statistical Data
HARTLEY, HERMAN O., Approximate Tests for Comparisons of Rank Correlations
HERCZR, BENET, International Criminal Statistics
HOADLEY, WALTER E., JR., Inadequacies of the Construction Estimates as General
    Economic Measures
HODGES, J. L., The Up and Down Method with Small Samples
HORVITZ, DANIEL G., On Respondent-Nonrespondent Differences Observed in the
    Pittsburgh Morbidity Surveys
HOUSEHOLDER, A. S., AND KIMBALL, A. W., A Stochastic Model for the Selection of
    Macronuclear Units in Paramecium Growth
HUNTER, J. S., AND BOX, G. E. P., The Study and Exploitation of Response Regions
HUNTSBERGER, D. V., An Extension of Preliminary Tests for Pooling Data
HURWITZ, WILLIAM N., AND HANSEN, MORRIS H., Developments in Collection and
    Processing of Mass Statistical Data
HUTTON, THOMAS G., Methodological Problems and Findings of Study of [illegible]
    of Old-Age Assistance
JONES, HOMER, Recent Revisions of Consumer Credit Statistics
JONES, HOWARD L., Optimum Cluster Size
KATZ, LEO, Probability Distributions of Group Organization Theory
KIMBALL, A. W., AND HOUSEHOLDER, A. S., A Stochastic Model for the Selection of
    Macronuclear Units in Paramecium Growth
KOHN, ROBERT, Questionnaire Design and Related Methodological Problems in the
    Canadian Sickness Survey
KOMMERS, WILLIAM J., AND DEL PRIORE, F. R., An Example of a Fractional Replication
    in a Bearing Abrasive Wear Test
KRUSKAL, WILLIAM H., The Problem of Nonnormality and Nonparametric Tests
KUZNETS, GEORGE M., AND FOOTE, RICHARD J., The Demand for Citrus Products
LATHROP, JOHN B., Operations Research as a Science
LAURENT, ANDRÉ, Use of Statistics in Engineering in France
LEW, E. A., Insurance Mortality Investigations of Physical Impairments
LIEBERMAN, GERALD J., AND CHERNOFF, HERMAN, A Note on the Use of Normal
    Probability Paper
LITTELL, ARTHUR S., AND SMITH, GEORGE V., Intervals Between Onsets of Multiple
    Cases of Poliomyelitis in Families
LOCKS, MITCHELL O., Two Nonparametric Tests Using the Method of Ranks for Testing
    the Randomness of Samples Drawn from Finite Populations
[name illegible], The Flow of Net Cash [illegible] Through Life Insurance Companies
LOOMER, HARLAN G., Intercensal Needs for Small Area Data: By a Local Planning Agency
[name illegible], SIDNEY, The [illegible] Stock Market
[name illegible], JOHN, Statistical Principles of Testing
MARSHALL, HERBERT, Problems of Co-ordination in the Canadian Statistical System
MAUSER, FERDINAND F., How the Automobile Industry Utilizes the Census Tract in
    Market Determination
MEYER, HERBERT A., Staffing a Central Statistical Organization
MICKEY, RAY, On Testing the Association of Mineral Occurrence with a Set of
    Observable Characteristics
MILLER, HERMAN P., Some Observations on the Inequality of [illegible] of Income
MONTGOMERY, DOROTHY S., Uses of Census Tracts in Housing




MOONAN, WILLIAM J., Multivariate Analysis of [illegible] for a Latin Square
MOORE, FREDERICK T., Some Statistical Evidence of Economies of Scale
MOREHOUSE, M. DUTTON, Trends in Monetary Structure in 1954
MORGENSTERN, OSKAR, Accuracy of Foreign Trade Statistics
MORTON, J. E., The Problem of Improving Mineral Statistics
MOSES, L. E., Some Comments on the Lot Plot Plan
MOSHMAN, JACK, A Two Sample Procedure for Linear Discrimination in Normal Samples
MOSS, MILTON, GEHMAN, CLAYTON, TRUEBLOOD, LORMAN, BROIDA, ARTHUR L., SCHWARTZ,
    M. H., AND CODY, PETER M., Industrial Production Index
MUDGETT, B. D., The Function of the Outside Consultative Committee in the Revision
    of Governmental Statistics
MUNRO, S., Some Aspects of Sequential Experimentation
MYERS, PERRY H., Use of Census Tracts for Business Analysis
MYERS, ROBERT J., Factors in Interpreting Mortality After Retirement
McKELVEY, V. E., AND GULBRANDSEN, R. A., Problems in Sampling the Phosphoria Formation
NETER, JOHN, Problems in Experimenting with the Application of Statistical
    Techniques in Auditing
NUTTER, G. WARREN, Monopoly, Bigness, and Progress
OLDS, E. G., Using the Experimental Approach in the Teaching of Statistics
PHILLIPS, A., The Stability of Technical Coefficients: Evidence from Inter-Plant
    [illegible] in Labor and Materials Productivity
PRESTON, G. W., Contemporary Topics in Statistical [illegible]
QUACKENBUSH, G. G., Demand Analysis from the M.S.C. Consumer Panel
REID, A. T., Stochastic Processes and the Study of Growth Phenomena
REID, MARGARET, Value of Dwellings in Relation to Income
RICE, STUART A., Problems of Coordinating the United States Statistical System
ROMIG, HARRY G., Probability of Acceptance for Sampling Plans Based on
    Average-Standard Deviation Acceptance Criterion
[name illegible] B., The Validation of Testing Programs for University [remainder illegible]
SCHWARTZ, M. H., GEHMAN, CLAYTON, TRUEBLOOD, LORMAN, BROIDA, ARTHUR L., MOSS,
    MILTON, AND CODY, PETER M., Industrial Production Index
SEARLE, ALLAN, AND GREENBERG, LEON, The New Bureau of Labor Statistics Indexes of
    Productivity in Manufacturing
SELLIN, THORSTEN, Problems and Prospects of Criminal Statistics in the
    [remainder illegible]
SHRYOCK, HENRY S., JR., Changing Geographic Patterns of Migration in the
    [remainder illegible]
SIMMONS, W. R., Business Uses of Intercensal Data
SMITH, GEORGE V., AND LITTELL, ARTHUR S., Intervals Between Onsets of Multiple
    Cases of Poliomyelitis in Families
SMITH, HARRY, JR., Weighting Coefficients for Age-Adjusted Death Rates
SOBEL, MILTON, BECHHOFER, ROBERT, AND DUNNETT, CHARLES W., A Single-Sample, a
    Two-Sample and a Sequential Multiple Decision Procedure for Ranking Means of
    Normal Populations with Known Variances
STRODTBECK, FRED L., Family Interaction and the Transmission of
    Achievement-related Attitudes
TAEUBER, CONRAD, Meeting the Needs for Small Area Intercensal Data
TAEUBER, CONRAD, AND ECKLER, A. ROSS, The Current Statistics Program of the
    Census Bureau
TAYLOR, WILLIAM F., Consistency of Estimators under a Specialized
    Bio[remainder illegible]
TRUAX, DONALD, An Optimum Slippage Test [remainder illegible]




TRUEBLOOD, LORMAN, GEHMAN, CLAYTON, BROIDA, ARTHUR L., SCHWARTZ, M. H., MOSS,
    MILTON, AND CODY, PETER M., Industrial Production Index
TUKEY, JOHN W., Unsolved Problems of Experimental Statistics
WAGENHALS, R. E., Demonstration Teaching of Statistics
WALLIS, W. ALLEN, Responsibility of a Centralized Statistical Organization in a
    University for Training in Statistics
WALLIS, W. ALLEN, Teaching Statistics to Executives
WEINGARTEN, HARRY, Continuous Sampling Plans
WELLS, ORIS V., Economic Forecast of the Agricultural Situation, 1952
WENTWORTH, EDNA C., Methodological Problems and Findings of Survey of Aged
    Beneficiaries of Old-Age and Survivors Insurance
WOLMAN, LEO, Wages Since 1914
WOODBURY, MAX A., Information Theory and Prediction
WOODBURY, MAX A., AND BROWN, ERNEST W., Time Series Factor Analysis with an
    Economic Application
ZWICK, CHARLES, BRINSER, AYERS, AND ALLISON, HARRY, An Analysis of the Demand for Meat


Scientific and Professional Manpower: the Value and Limitations of a Statistical Approach. ELI GINZBERG,
Columbia University.

A major shortcoming of American social science is its infatuation with the doctrine that if one
only had a sufficient number of facts, one could solve any problem. This particular preconception is
grounded in the following: (1) Our pragmatic position with its anti-theoretical bias. (2) The availability
of funds to support expensive data collection undertakings. (3) Our desire to make progress quickly and
our naive belief that the activities connected with data collecting are proof of progress. (4) Our belief
that the “facts” will be able to resolve all difficulties and that there are no underlying value conflicts.

During the past few years much energy has been poorly directed because of an error in strategy:
We started to collect large bodies of data without knowing what to do with them, and without having
any clear idea of the key questions they were supposed to answer. Much of this data collection has been
spearheaded by interested pressure groups to support a particular point of view. Much of the governmental
and academic effort in the collection of data has been of dubious value. One of the most serious
results of a preoccupation with statistical tabulations has been the blithe assumption that the concept of
shortage (or balance) in professional manpower is a simple arithmetic relation between supply and
demand. Similarly, there has been no proper attention paid to the role of substitution in matters of
supply. Among the other significant facets of the problem that are not illuminated easily via statistics
is that of utilization, which can of course greatly influence whether any given supply proves to be adequate
or not. Entirely too little emphasis has been placed upon the fact that any particular manpower
problem can probably be resolved in any one of a series of ways depending in large part upon the criteria
that are employed. Perhaps the most serious shortcoming of all growing out of a quantitative
approach is the inherent tendency contained therein to gloss over qualitative differences among professional
persons and to see the problem primarily as one of numbers.

There are at least five major areas in which organized statistical efforts can make a significant contribution
to the illumination of scientific and professional manpower problems: (1) By providing
knowledge of the occupational structure; (2) By facilitating studies of the probable size of future supply;
(3) Through contributing to the organized study of future demand by studying the strength of
particular factors that have influenced demand in the previous time periods; (4) By helping to set out
in a systematic fashion the incentive factors in different occupations which influence training and
distribution of trained persons; and (5) By studies of the flow of individuals in and out of different types
of employment, through which the strategically important question of convertibility of trained manpower can be
illuminated. The rate at which a statistical approach can contribute along the foregoing lines to the
understanding and solution of professional manpower problems will depend very greatly on the extent
to which existing theory can be improved with respect to the concept of balance, the potentialities and
limitations of “convertibility” in the use of highly trained persons, the elaboration of the significant
relations between qualitative and quantitative considerations, and the factors determining different
utilization levels.


Recent Advances in Statistics on Scientific and Professional Personnel. HAROLD GOLDSTEIN, Bureau of
Labor Statistics.
Widespread interest in professional persons has focused attention on statistical data in this field.
Recent improvements include (a) improvement in the frequency, currency, detail, and over-all quality


328 AMERICAN STATISTICAL ASSOCIATION JOURNAL, JUNE 1954


of the data on output of trained personnel from the educational institutions; (b) development of techniques
for registering members of professions; (c) active participation by professional societies in
studies of their fields; (d) a beginning in the study of occupational mobility in the professions. The
kinds of data we have and the gaps in information reflect the emphasis on large-scale statistical surveys;
more intensive studies are needed to provide missing data.

Among the areas of work which should be productive are: I. Development of measures of qualitative 
differences among individuals to temper the purely quantitative information now available. II. Develop- 
ment of surveys of professionals via employing establishments, as an essential supplement to census- 


International Criminal Statistics. Вехот HELGER. 

In spite of repeated efforts over a hundred years to achieve greater uniformity of statistics on crime,
a direct comparison of national series showing crime rates, and the trend of these rates, is scarcely possible.
Criminal statistics cannot be taken at their face value, and international comparisons are particularly
fallacious without sufficient knowledge of all particulars, pertaining to the penal and judiciary
system of countries, that are essential for an interpretation of national figures on crime. A new approach
to the problem of international comparisons would consist in selecting for this purpose certain actions
that are universally recognized as criminal and are of such a nature that they regularly become known
to the authorities, and redefining these actions so as to give a uniform content to the series that are compared.
Such data, stemming from the police or the investigating authorities, would be valuable as indicators
of the frequency of crime, whereas court statistics, supplemented by information on suspended
sentences, probation and parole, etc., have their value for studies of the treatment of offenders. Current
statistics on crime can scarcely reveal the causes of criminality but may serve as a frame for further research
into this subject by giving a general picture of the dimensions of criminality, the incidence of
various types of criminality within socio-economic groups, and its evolution under the impact of social
changes. In this connection, international comparisons present a great deal of interest.


The Study and Exploitation of Response Regions. G. E. P. Box and J. S. Hunter.
Techniques have been proposed by Box and Wilson for the study of response surfaces. Having 


allow the fitting of a polynomial of degree d as efficiently and economically as possible.


duced. This distribution yields the information per observation obtained at a point on the fitted surface 
at any location in the factor space.

It is suggested that a spherical information distribution is desirable (i.e., a distribution such that 
the information is constant on spheres centered at the origin of the design). Designs which give a spher- 
ical information distribution satisfy the criteria that the variance-covariance matrix and the moment 


Developments in Statistical Techniques of the Last Decade of Special Interest to Economists: Sequential
Analysis. KENNETH J. ARROW, Stanford University.
This paper is a survey of the fundamental ideas of sequential analysis and its implications for
economics. The general conclusion is that sequential analysis as a statistical technique is likely to find
few applications to econometrics because of the special problems of sampling in the typical problems of
economic statistics, but that the type of reasoning underlying sequential analysis is likely to be of
considerable importance in the theory of dynamic economics, particularly where actions with random
consequences are under consideration. The technique of sequential analysis of statistical data is applicable
when the experimenter can control the taking of additional observations on the basis of the results of
earlier ones. The additional observations are possible at a cost. The technique is therefore not ap-

iy be difficult to enumerate after each observation, or to multi-purpose surveys, where the
may be different for different questions asked. A brief development of the theory of se-
between two hypotheses is given from a Bayesian viewpoint. After a number of observations,
the original a priori probabilities are transformed into a set of a posteriori probabilities, but
is essentially unchanged, except for the magnitudes of the probabilities. This simple observation
enables an immediate derivation of the sequential probability-ratio test. A problem of choice
of inventory policies is then stated. It is shown that the same type of reasoning used in the analysis
of the sequential decision problem leads to a functional equation which can be solved to yield an optimal
to the inventory problem.
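
The sequential probability-ratio test named in this summary can be illustrated with a minimal sketch. Each observation multiplies the odds between the two hypotheses by a likelihood ratio, so a running log likelihood ratio is compared with two fixed bounds. The Bernoulli setting, the function name sprt_bernoulli, and the use of Wald's approximate bounds are assumptions made for illustration; they are not taken from the paper.

```python
import math


def sprt_bernoulli(observations, p0, p1, alpha=0.05, beta=0.05):
    """Sketch of Wald's SPRT for H0: p = p0 against H1: p = p1 on 0/1 data.

    Returns (decision, number_of_observations_used).
    The bounds are Wald's approximations, an assumption of this sketch.
    """
    upper = math.log((1 - beta) / alpha)  # stop and accept H1 at or above this
    lower = math.log(beta / (1 - alpha))  # stop and accept H0 at or below this
    log_ratio = 0.0
    n = 0
    for n, x in enumerate(observations, start=1):
        # Each observation adds its log likelihood ratio to the running total.
        if x:
            log_ratio += math.log(p1 / p0)
        else:
            log_ratio += math.log((1 - p1) / (1 - p0))
        if log_ratio >= upper:
            return "accept H1", n
        if log_ratio <= lower:
            return "accept H0", n
    return "undecided", n
```

For example, sprt_bernoulli([1, 1, 1, 1, 1], 0.3, 0.7) stops after the fourth observation, because the decision to continue is re-examined after every draw rather than at a fixed sample size.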


im of Non-normality, and Nonparametric Tests. WILLIAM H. KRUSKAL, University of Chicago.
Most statistical procedures used in practice are based upon the assumption of normality. The
arguments usually given in defense of this assumption are stated and critically discussed. A method of
iting the assumption of normality is the use of nonparametric statistical procedures. The
advantages and disadvantages of these procedures are discussed in general, and some examples are
procedures of applied nonparametric analysis are classified, and key references are given.


cal Index of Highway Ton Miles. JAMES P. Оковак.


investigation has as its objective the construction of a composite weighted cyclical index of
# operated by trucks and truck combinations on all main and local rural roads in the United
States. The index computes for any particular month the cyclical position of highway ton miles with
the computed normal or expected value as of that month. Polynomial curves of appropriate
were fitted to the cyclical-irregular movement of highway ton miles over a 78-month period.
u affording a test for goodness of fit, include the variance ratio or Snedecor's “F,” the standard
of the residual variance, and the appearance of the plotted curve. This fitting (or smoothing)
eliminates the irregular movements and causes those fluctuations attributable to cyclical forces
out in bold relief. It should be pointed out that the cyclical irregular movement to which the
fitted is in terms of composite weighted standard deviation units from “normal.”
Consequence of this investigation the writer has been able to isolate and to delineate a structural
of the cyclical movement of highway ton miles which, with varying intensity, tends to repeat
approximately 32-month intervals. This mathematical function (the orthogonal polynomial
) appears to postulate an underlying law, governing the behavior of highway ton miles. The structural
pattern of the cyclical movement of highway ton miles is predicated upon an assumption of eco-
hm in the ton mile series. Even though trend and/or seasonal do change, the structural pattern
of highway ton miles can be expected to continue, showing much the same contour as before. In
the contour of the structural pattern will tend to remain more or less constant irrespective
from which measured. From the standpoint of forecasting this is a most important con-


Investigation has been in part analytic and in part synthetic. In other words, it first concerned
itself with a breaking down of the highway ton mile series into its component elements. In the latter

the investigation directed its efforts to a recombining of the constituent elements into a theoretical
series. The synthesis makes possible an extrapolation or projection into the future. It also serves
Or confirm the correctness of the analysis by reconstructing the highway ton mile series from its


te Tests for Comparisons of Rank Correlations. HERMAN O. HARTLEY.
numerous measures of rank correlation the two best known are: Spearman's rank correlation r_s and
Kendall's rank correlation t_k. Some results on the exact distribution of these measures are
the case of 0-correlation and some moments for the case of ranks generated by a bivariate
normal population with correlation coefficient ρ. In the latter case the present note investigates the
on of r_s and t_k, both by the method of statistical differentials and by Monte Carlo calculations.
It is found that for moderately large sample size both z-transforms are approximately normal
approximately independent of ρ. This property results in simple tests of significance for

of rank correlations so transformed. The restriction to ranks generated by samples from
normal samples is lifted and the tests shown to apply to a much wider class of ranks.
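
As a small illustration of the quantities in this summary, the sketch below computes Spearman's and Kendall's rank correlations for untied data and applies the usual z-transform, z = arctanh(r). The function names are hypothetical, the no-ties restriction is an assumption of the sketch, and no variance constants are given because the summary does not state them.

```python
import math


def ranks(values):
    """Ranks 1..n of the values; this sketch assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(x, y):
    """Spearman's rank correlation from squared rank differences (no ties)."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))


def kendall(x, y):
    """Kendall's rank correlation: (concordant - discordant) / total pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            score += (product > 0) - (product < 0)
    return 2 * score / (n * (n - 1))


def z_transform(r):
    """The z-transform of a correlation, defined for -1 < r < 1."""
    return math.atanh(r)
```

A comparison of two transformed rank correlations would then be referred to the normal distribution with a variance that depends on the sample size; those constants would have to be taken from the paper itself.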


Cluster Size. HOWARD L. JONES.

a two-stage sampling procedure consists in selecting a number of clusters of equal size, and

is a linear function of the number of clusters and the number of individuals selected, the

size is a simple function of the parameters of the cost function and the intraclass and

correlation coefficients. Suggestions for estimating these parameters and coefficients are




proposed and discussed for an illustrative example. The more general situation is then examined where 
the number of attempted selections is the same for every cluster, but the actual number of individuals 
inspected varies from cluster to cluster. 
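
For orientation, the familiar textbook version of this problem can be sketched as follows. It assumes a cost of c1 per cluster plus c2 per individual and a single intraclass correlation rho, which gives an optimum cluster size of sqrt((c1/c2)(1 - rho)/rho). This is a simplification chosen for illustration: the summary indicates that the paper's own formula involves a second correlation coefficient, which is not reproduced here, and the function name is hypothetical.

```python
import math


def optimum_cluster_size(cost_per_cluster, cost_per_individual, intraclass_corr):
    """Textbook optimum number of individuals per cluster (a simplification).

    Minimizes the product of total cost n*(c1 + c2*m) and the variance factor
    (1 + (m - 1)*rho) / (n*m) with respect to the cluster size m.
    """
    if not 0 < intraclass_corr < 1:
        raise ValueError("intraclass correlation must lie strictly between 0 and 1")
    ratio = cost_per_cluster / cost_per_individual
    return math.sqrt(ratio * (1 - intraclass_corr) / intraclass_corr)
```

With a cluster cost 16 times the individual cost and rho = 0.5 the sketch gives a cluster size of 4; a smaller rho or a costlier cluster pushes the optimum upward.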


A Note on the Use of Normal Probability Paper. HERMAN CHERNOFF and GERALD J. LIEBERMAN. 


This paper illustrates, with a special example, that the graphical technique to be applied to a 
problem should depend to a large extent on the use to which the graph is to be put. In particular, we 
treat the problem of selecting the representation of a sample on normal probability paper when it is 
desired to obtain “optimum graphical” estimates of the mean μ and standard deviation σ of a normal
distribution. 
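
A rough numerical analogue of reading estimates off normal probability paper is sketched below: the ordered sample is paired with normal quantiles at chosen plotting positions, a straight line is fitted, and the intercept and slope serve as estimates of the mean and standard deviation. The plotting position (i - 0.5)/n and the least-squares fit are assumptions of this sketch; they are not the “optimum” representation the paper derives, whose details the summary does not give.

```python
from statistics import NormalDist


def probability_paper_estimates(sample):
    """Return (mean_estimate, sd_estimate) from a fitted probability-plot line.

    Assumes at least two observations and plotting positions (i - 0.5) / n.
    """
    xs = sorted(sample)
    n = len(xs)
    standard_normal = NormalDist()
    zs = [standard_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    z_bar = sum(zs) / n
    x_bar = sum(xs) / n
    # Least-squares line x = intercept + slope * z through the plotted points.
    s_xz = sum((z - z_bar) * (x - x_bar) for z, x in zip(zs, xs))
    s_zz = sum((z - z_bar) ** 2 for z in zs)
    slope = s_xz / s_zz
    intercept = x_bar - slope * z_bar
    return intercept, slope
```

Changing the plotting positions changes the estimates, which is the point of the summary: the representation should be chosen with the intended use of the graph in mind.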


Two Nonparametric Tests Using the Method of Ranks for Testing the Randomness of Samples Drawn
from Finite Populations. MITCHELL O. LOCKS, University of Oklahoma.


In this paper are developed two nonparametric tests based on the method of ranks, both of which
may be used to test the randomness (or representativeness) of samples drawn from finite populations. 
The theory behind the development of these tests is essentially the same as that governing tests for 
randomness which use the original values. Since the application of these tests requires the ranking of 
all of the items in the population according to some numerical characteristic, the tests can be applied 
only if values with respect to at least one numerical characteristic are available for every single item in 
the population. Admittedly, this restricts the scope of usefulness of these tests. However, it is believed 
that the tests can be used more profitably for certain types of problems (e.g. tests for randomness of 
samples drawn from J-shaped and U-shaped populations) than other statistical tests for randomness 
now in use. 
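
One simple rank-based check of this kind can be sketched as follows; it is not necessarily either of the two tests developed in the paper. If every item in a population of N items is ranked on some numerical characteristic and n items are drawn at random without replacement, the sum of the sampled ranks has mean n(N + 1)/2 and variance n(N + 1)(N - n)/12. The function name and the normal-deviate form are assumptions of the sketch.

```python
import math


def rank_sum_z(sample_ranks, population_size):
    """Standardized rank sum of a sample drawn from a fully ranked population.

    Assumes distinct ranks 1..N and a sample smaller than the population.
    """
    n = len(sample_ranks)
    big_n = population_size
    if not 0 < n < big_n:
        raise ValueError("sample size must be between 1 and N - 1")
    total = sum(sample_ranks)
    mean = n * (big_n + 1) / 2
    variance = n * (big_n + 1) * (big_n - n) / 12
    return (total - mean) / math.sqrt(variance)
```

A sample consisting of the five lowest-ranked items out of ten gives a deviate of about -2.6, which would cast doubt on the claim that the sample was drawn at random.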


A Two Sample Procedure for Linear Discrimination in Normal Samples. JACK MOSHMAN, Oak Ridge
National Laboratory.


The problem of discrimination from a normal linear regression model considered is that of obtaining
a confidence interval for x corresponding to an observed y when the expectation of y, given x, is
E(y|x) = α* + β*x. Various incongruities are discussed which result from a naive approach. A deterministic
one sample procedure is shown to have two difficulties: (1) The procedure is inefficient in that the confidence
interval obtained is larger than a specified 1 − γ; (2) The excess is a function of the unknown
variance. A two sample procedure is exhibited which makes the excess over 1 − γ to be less than a
specified δ (0 < δ < γ) whenever |β*| < l for any predetermined l > 0 and the excess is independent of the
variance.

Probability of Acceptance for Sampling Plans Based on Average-Standard Deviation Acceptance Criterion.
HARRY G. ROMIG.


The evaluation of a variables sampling plan using Average-Standard Deviation Acceptance Criterion
requires Probability of Acceptance values for determining its Operating Characteristic. The
mathematical relations are presented for computing such probabilities as developed in 1934 assuming a
normal law distribution for the parent population and resultant exact distributions for averages and
standard deviations where the correlation between averages and standard deviations is zero for two
cases: 1. Sample size n large: distributions of averages and standard deviations assumed normal; and
2. Sample size n small: distribution of averages normal and of standard deviations non-normal. These
results may be compared with later work started in 1947 at Stanford University by Goode, Bowker,
Ireson, and Resnikoff.

For the two cases a cutting plane slices the frequency surface for averages and standard deviations
and volumes under the surface are determined for shifts in level (average), variability (standard
deviation), or both, due to changes in the system of causes. Levels of incoming quality with respect to
engineering requirements and the protection desired for any sampling plan determine the location of
the cutting planes used for evaluation. The Average-Standard Deviation Acceptance Criterion is set up
for either a minimum or maximum engineering limit for sample values of averages and standard deviations.
For such criteria, relations are provided for determining the volume under the surface to the
right or left of the cutting plane covering both max. and min. limits. These relative volumes provide the
desired probability of acceptance values for any postulated incoming quality p’. A simple approximation
for n small is given. Exact and approximate P values for samples of 5 are presented for two sampling
plans for evaluating the error of the approximation. A graphical solution of the multiple integrals is
described. These probability of acceptance values must be obtained in order properly to set up various
types of variables sampling plans.
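
The volume on one side of the cutting plane can also be approximated by simulation, as in the sketch below. It assumes a criterion of the common form "accept if sample average plus k sample standard deviations does not exceed a maximum limit U", a normal parent population, and hypothetical parameter names; the paper's exact criterion and its analytic relations are not reproduced.

```python
import math
import random


def prob_acceptance(mu, sigma, n, k, upper_limit, trials=20000, seed=0):
    """Monte Carlo estimate of P(sample average + k * s <= upper_limit).

    mu and sigma describe the assumed normal incoming quality; n is the
    sample size (at least 2); k is the acceptance constant of the plan.
    """
    rng = random.Random(seed)
    accepted = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        average = sum(sample) / n
        s = math.sqrt(sum((x - average) ** 2 for x in sample) / (n - 1))
        if average + k * s <= upper_limit:
            accepted += 1
    return accepted / trials
```

Evaluating the function over a range of values of mu (or sigma) traces out an operating characteristic of the kind the summary describes, within simulation error.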




Time Series Factor Analysis with an Economic Application. Enxzsz W. Brown and Max A. Woodbury.
It has been recognized that the application of many statistical techniques to time series is invalid
due to the correlation of successive observations. Among the techniques affected is factor analysis, since
many of the underlying relations, which a factor analysis should uncover in a time series, act with a time
delay, and hence would not be discovered. An appropriate definition of a factor and a technique of time
series factor analysis are formulated and discussed and an application to a set of ten economic time
series is made. The ten series cover national product, income and employment. In factor analysis, if
there are n original variables, no more than about n/2 factors can be identified. An interesting feature
of time series factor analysis is that this does not necessarily hold true. In fact, factor analysis of time
series may require, and allow identification of, more factors than variables.


A Stochastic Model for the Selection of Macronuclear Units in Paramecium Growth. A. W. KIMBALL
and A. S. HOUSEHOLDER, Oak Ridge National Laboratory.


Prior to division, the macronuclear units in Paramecium (see, for example, Sonneborn, Ann. Rev.
Microb., pp. 55-80, 1949) presumably double in number in a manner similar to chromosome doubling at
mitosis. During the process of division each daughter animal receives approximately one-half of the
units present. Thus in any population of Paramecia, each animal has about the same number of units.
If there are n units per animal, there may be as many as n different types of units, or all units may be
alike, and any combination between these extremes is also possible. Such combinations are called states.
Given the state of a single animal and given an hypothesis about the selection of macronuclear units
by daughter cells, the probability distribution of states after N divisions may be computed readily
by the methods of stochastic processes. Results have been obtained for the hypothesis of completely
random selection and have been compared with experimental data.
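
The computation described above can be sketched for the simplest case. The sketch assumes only two types of unit, exact doubling from n to 2n units, and a daughter that receives exactly n of the 2n units drawn at random without replacement, so that the transition probabilities are hypergeometric. These simplifications and the function names are assumptions made for illustration; the paper's state space allows up to n types.

```python
from math import comb


def transition_matrix(n):
    """P[k][j]: chance a daughter has j type-A units when the parent has k of n.

    After doubling there are 2k type-A units among 2n; the daughter is
    assumed to draw n of them at random without replacement.
    """
    total = comb(2 * n, n)
    return [
        [comb(2 * k, j) * comb(2 * n - 2 * k, n - j) / total for j in range(n + 1)]
        for k in range(n + 1)
    ]


def distribution_after(n, start, divisions):
    """Distribution of the number of type-A units after the given divisions."""
    matrix = transition_matrix(n)
    dist = [0.0] * (n + 1)
    dist[start] = 1.0
    for _ in range(divisions):
        dist = [
            sum(dist[k] * matrix[k][j] for k in range(n + 1))
            for j in range(n + 1)
        ]
    return dist
```

Under these assumptions the pure states (0 or n units of type A) are absorbing, so repeated division along a line of descent drifts toward animals whose units are all alike.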


"Stochastic Processes and the Study of Growth Phenomena. A. T. Rum, Columbia University, 
This paper is divided into three parts: (1) Introduction, (2) Construction of stochastic models,
and (3) Statistical inference in stochastic models. In Part 1 we discuss the deterministic and stochastic
approaches to the study of growth phenomena, and give an introduction to the theory of branching
processes as developed by Bellman and Harris. We consider in Part 2 the construction of
various stochastic models for growth using the above theory. Models for birth, birth-and-death, and
mutation processes are discussed. The use of these formal models in the study of epidemics and rumor
spread, as well as in the study of bacterial growth, is pointed out. In Part 3 we discuss problems of
estimation and testing associated with stochastic growth processes. Previous work is reviewed, and
some recent investigations on sequential decision problems for branching processes are discussed.
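
A minimal sketch of one stochastic growth model is given below. It uses the simple discrete-generation (Galton-Watson) branching process rather than the age-dependent theory of Bellman and Harris, and computes the probability that a line descended from one individual eventually dies out, as the smallest root of s = f(s), where f is the offspring generating function. The function name and the iteration scheme are assumptions of the sketch.

```python
def extinction_probability(offspring_probs, tol=1e-12, max_iter=100000):
    """Smallest fixed point of f(s) = sum_k p_k * s**k, found by iterating from 0.

    offspring_probs[k] is the probability that an individual leaves k offspring.
    """
    q = 0.0
    for _ in range(max_iter):
        new_q = sum(p * q ** k for k, p in enumerate(offspring_probs))
        if abs(new_q - q) < tol:
            return new_q
        q = new_q
    return q
```

For a cell that dies with probability 1/4 and divides in two with probability 3/4, extinction_probability([0.25, 0.0, 0.75]) is about 1/3, whereas a deterministic growth model would predict unchecked growth with certainty.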


The New Bureau of Labor Statistics Indexes of Productivity in Manufacturing. LEON GREENBERG and
ALLAN D. SEARLE.


The Bureau of Labor Statistics is planning to publish several series of productivity indexes in
manufacturing covering the years 1939, and 1947 through 1952. Two of the series will show changes in
man-hour requirements for physical production in manufacturing. A third series, based on concepts
similar to the gross national product approach, will show changes in man-hour requirements for dollar
fits added in manufacturing and will reflect the influence of changing inputs of material, supplies and
fuch. The various measures fill in a statistical gap of 13 years for which reliable all-manufacturing productivity
statistics are not available.
The indexes are being derived from secondary sources of production and man-hour data. In relating
these statistics to derive productivity ratios numerous data problems are encountered. Not all
of them can be solved immediately and the indexes released by the Bureau will not be precision instruments.
But they will still be useful and necessary indicators in the broad areas of economic and business
manpower planning and productive efficiency.


Problems and Prospects of Criminal Statistics in the United States. THORSTEN SELLIN. 
Criminal statistics issued by agencies dealing with crime or criminals have been assumed to be
sources of data for the scientific study of the trends of criminality, the characteristics of
offenders, and the efficiency and effectiveness of the agencies dealing with them. The dearth of good
criminal statistics in the United States is discussed and reasons for this situation considered. The main
part of the paper examines the problem of how such statistics can be used in the study of trends of
criminality and the characteristics of the offender, and raises a number of questions designed to stimulate
discussion of this problem. Illustrations are offered pointing to the need of improving certain
series now issued, attention being drawn in particular to the need for some revision of the
classification of offenses employed by many national and state agencies and the need for more
СУ computed crime rates than those now being published.


Organization and Functions of a Complete Centralized Statistical Center at a Land Grant College.
T. A. BANCROFT, Iowa State College.

The position is taken that there is no unique optimum organization or program for a statistical
center which will work equally well at all universities or even at all land grant colleges. Further, even at
the same institution any one of several alternative arrangements might work equally well. A complete
statistical center should provide: (i) a research and teaching program in statistics per se in order to develop
new statistical theory and methodology and train statisticians; (ii) a service teaching program to
provide for basic general courses in theory and methods and specialized courses in statistics for students
majoring at the undergraduate and graduate levels in some other substantive subject matter area;
(iii) a consulting service program, i.e. recognized and budgeted time for various staff members of the
statistical center to consult with research workers on investigations involving the use of statistical
theory and methods; and (iv) a computing service for the programming and analyses of data resulting
from research investigations. These four main objectives could be accomplished by a separate department
of statistics, or, in case of a small group, a sub-department, and a campus-wide statistical laboratory
of institute status.


Role of a Centralized Statistical Organization in a University. J. С. DAnnocn, Washington State College, 

The recommended relationship between the statistical organization and the various research units
of the University may be set forth as it affects some of the major units. (1) The Agricultural Experiment
Station: two or more staff members assigned to service this unit, with the privilege of calling on their
associates to aid in special problems or to share the load at seasonal peaks. The consultor should be
permitted to exhibit a preference among the possible consultants, if he so wishes. Some Directors might
prefer to have the consultants attached to their staff but the preference would seem to be to house them
with the statistical group, unless there is a distance problem, and have them administratively responsible
to the head of this same group. (2) The Department of Mathematics: the liaison here, established by
joint appointment or some other effective means, needs to be rather close. The relationship here on
consulting problems may well be one of the statistician consulting with the mathematician. (3) The
Social Science Departments: the needs here could best be served by staff members specifically trained
in certain statistical disciplines, again a matter of assignment. (4) Engineering, Medicine and other
fields: while specialized knowledge would be an advantage there are many problems where a straightforward
statistical approach would be of material assistance to the research program. The inference here
is that much can be done by the general consulting staff in the early development of a consulting load.
As the loads from these areas build up it would be advisable to add personnel with background training
in the various fields. (5) The computing facility: for statistical problems the consultant should provide
the liaison between the worker and this unit. Thus it would appear desirable that the computing unit
be an integral part of the organization under discussion.

The other aspect of the consultation program that should be mentioned is that having to do with 
the needs of graduate students. The most desirable pattern would seem to be that of the graduate stu- 
dent and his major professor both participating at least in the initial consultation. At that time the 
relationship of the consultant to the research problem should be fairly clear, and responsibility for 
direction of statistical matters and for computing decided upon. 


being considered to be involved whenever the proposed experiment is to produce sampling data. (b) This
committee is to make written recommendations to the directors of the research programs, including the


result from the research they are subject to the rules of (a) and (b).
Most colleges and universities have limited funds, and someone must be convinced that the research


tistical research. The division should not necessarily be in thirds; emphasis should be on the activity
best adapted to each person's interests and abilities. This plan should point statistical research
problems the applied scientists need to have solved, and should keep the statisticians from getting into
intellectual “ruts.” Some may think the cost of paying several persons capable of research instead of
one or two who do nothing else will be greater. Many of our young Ph.D.'s are capable of research,




valuable experience in such positions. If they like this multi-purpose activity and choose

it, their productivity should increase with the cost of keeping them indefinitely. Before they 
ianently on such a job they will realize there is a limit to 
lere if they prefer. In the meantime the school will ben: 


wish to supplement the above plan with 
lent research personnel who wish to do full-time theoretical research.


The number and qualifications of the members of the staff of the Central Statistical Organization
upon the functions and responsibilities that are assigned to the Central Organization. These
grouped under the following heads: (1) Consultation, (2) Teaching: theory and application of
methods, (3) Evaluation of manuscripts where statistical methods are used, (4) Research,
theoretical and applied, and (5) Computing Service (machine room).
In a large university, the technical staff of the Central Statistical Organization should be composed
of specialists in various statistical fields. Where teaching is one of the functions, the staff will have to
be large and those who teach should also do research and consultation. Where teaching is decentralized,
joint appointments of at least some of the teachers would seem to be desirable in order to
keep the Central Organization in close relationship with those departments where problems in the
application of statistical methods generally arise. The plan of joint appointments is working quite
ОШУ in the University of Florida, where the teaching of statistics is not centralized.
In a small university, the same general organization would seem desirable, but if funds are limited
and needs less varied, the staff will be small. This calls for greater versatility on the part of the
members of the staff. While in a large university the Central Statistical Organization may be, and
an independent unit, in a smaller institution it may be necessary for financial or other reasons
to make it a branch of the mathematics or some other department. The computation service, or
machine room, is an important section of the Central Statistical Organization. To be effective, it must
led with a competent supervisor familiar with statistical methods, as well as with competent
key punch operators and machine operators. Where courses are given in the university in the operation
and other types of computing machines, these can serve to train potential replacements and
to the card machine computing section. It is a difficult problem to find persons with the
qualifications who are willing to accept the starting university salaries because of the demand
and government for statisticians at salaries above those in most universities; and this, the
finding a staff at all, is perhaps the most difficult of all staffing problems.


ity of a Centralized Statistical Organization in a University for Training in Statistics.
W. ALLEN WALLIS.


Departments in fields other than statistics normally do not have faculties qualified to teach current
methods, so the responsibility for statistics courses should not rest with them. On the other
hand, statistics departments vested with sole authority over the statistics courses may tend to overemphasize
statistics and its mathematical prerequisites at the expense of the students’ subject matter
we have developed what looks to be a promising method for reducing duplication
of statistics courses in the various departments and raising the level of instruction, while avoiding the
of authority centered in the statistics department.
The first step was to introduce a new elementary course taught by one of our own men. After this
well developed, people teaching statistics in various departments were invited to merge
their courses with ours. In general, their reaction was to introduce our syllabus and other materials but
their instructors and the independence of their courses. The syllabus and teaching materials were
somewhat in the light of suggestions and criticisms from these instructors in other departments,
one or two departments did merge their courses with ours. As those departments reported
their experience, others took the plunge. Essentially the same evolution has occurred
the advanced course on statistical inference.
this system, each department offers the unified course in its own section of the catalog with its
but all designate the same room and instructor. Any department could secede simply by
room assignment in its Announcement. Actually, this would not be likely to happen
consultation with the statistics department, since other departments know that we are sensitive
to their needs and limitations as well as to our own standards and requirements. Unification of statistics
its their being too closely tied to a narrow range of substantive problems, encourages
iples from a variety of fields, and is generally an advantage pedagogically.
the device used at Chicago would work at other universities is a question that must be
context of each institution. If, with the elimination of a course to reduce duplication,


a department suffers cuts in appointments and budget, then it will be less ready to reduce the duplica- 
tion. Some budgetary readjustments may be in order; but so long as cuts are not allowed to work to 
the direct disadvantage of those who help achieve the gains, many of the obstacles to eliminating 
duplication in universities seem to be avoided. 


Trends and Cycles in German Wages. Gerhard Bry.

The paper compares German, English, and American long-term trends in wage levels and in wage
structure, as well as their cyclical behavior, from 1871 to 1953. Hourly money wages are shown to
have increased about sevenfold in Great Britain, eightfold in Germany, and about tenfold in the
United States. The divergence in the increase of real wages—two and one-half fold for Germany and
Great Britain and fourfold for the United States—indicates the close relationship between wages and
economic and political fortunes in these countries. A tendency toward decreasing differentials can be
observed in the three countries. This trend towards greater equality is relatively clear in German skill,
sex, age, and regional differentials. It is suggested also in city size and industrial differentials. Basically,
this trend follows from the industrialization process itself, which leads to reduced differences between
wage earners.

Although wage rates respond to general business contractions—by decreasing rates of growth, stagnation,
or declines—there is disclosed a definite downward rigidity of wage rates. During the period
1871-1953 only two substantial declines in German money wage rates occurred. Earnings have larger
cyclical amplitudes than rates because overtime, output bonuses, and hours worked are cyclically more
sensitive. Even in the Great Depression, real weekly earnings in Germany and in the United States
declined by only 15 per cent. In both countries the real sufferers from the Great Depression were the
unemployed rather than the employed workers.


Union Impact on Wage Structures. H. M. Douty, Bureau of Labor Statistics.


A series of studies by the Bureau of Labor Statistics indicate that relative wage differentials among 
occupations have declined markedly in American industry, particularly during recent years. Over the 
past half century, several factors calculated to narrow job differentials in the long run can be distinguished.
Among these are the sharp decline in immigration during World War I and the subsequent
adoption of a restrictive immigration policy; a declining birth rate until the 1940's; a rising level of 
education and training among the working population; and the mechanization of large areas of unskilled 
work. These broad labor market forces clearly produced some tendency for relative job differentials to 
diminish. The decline in job differentials during the past decade, however, was much sharper than would 
have been anticipated on the basis of long-run labor market forces alone. One factor was governmental 
wage policy, notably in the economic stabilization program during World War II. Of major importance
was a marked tendency for unions, beginning in the defense period in 1941, to formulate their wage demands
in terms of uniform money increases (and hence unequal percentage increases among jobs) and
settlements to be made in this fashion.

Among the reasons for this development were: (1) uniform money increases tend to appear more 
equitable in an inflationary period when wage increases are designed largely to offset increases in living 
costs; (2) it is politically easier, especially in an inflationary situation, for union leadership to press for
uniform money increases; (3) within limits, skilled workers may be content with the maintenance of absolute
wage differentials, even though their relative wage position is deteriorating; (4) on the whole, employers
during the past decade have seemed more concerned with the size of negotiated wage increases
than with the form of their distribution; (5) the structure of wages probably seems less important than 
the general level of wages, at least in inflationary periods. The decline in relative wage differentials 
among jobs has been much more marked in some industries than in others. Divergence in experience 
within manufacturing is well illustrated in the basic steel and automobile industries. Assuming reasonable
economic stability, it is probable that relative job differentials will receive considerable attention
during the next few years from unions and employers. Special wage adjustments for skilled workers in
some industries during 1953 point in this direction. Systematic review of the structure of job rates may
occur in a significant number of situations. 


Wages Since 1914. Leo Wolman, Columbia University.


This paper deals with one segment of wages in this country, the behavior of hourly earnings in five
industries—manufacturing, class I railroads, building, anthracite and bituminous coal mining. Average
hourly earnings, gross and net, are presented as an approximation of changes in the price of labor
during these 40 years. Despite interruptions in their upward movement, money and real hourly wages
multiplied many times over the period. All five groups shared in the rise, but unequally. Anthracite
coal topped the list with nearly a 10-fold increase; building trades' rates rose least, less than [illegible].
Real hourly wages likewise mounted, but, of course, less than money wages. Anthracite real wages rose




SUMMARIES OF PAPERS 335 


8.5 times, manufacturing 2.8 and building over 2 times. The interesting question is, what economic
conditions accounted for this multiplication in real and money wages? Using manufacturing wages 
as an illustration, it is clear that the bulk of the total money advance of $1.47 an hour was a product of 
the economic conditions of war and postwar boom, since only 10 cents of this amount were added be- 
tween 1920 and 1940. 

When money wages are converted into real wages, surprising results are obtained. For real wages 
showed marked improvement without apparent regard to movements in money wages, business conditions
or union organization. Comparing the course of money and real wages in the two wars one encounters
many difficulties. The first was shorter and didn’t have wage controls. In several industries wages in
World War II were fixed in long-term contracts. Bearing these factors in mind, the available evidence
suggests that money and real wages rose more between 1916 and 1918 than from 1941 to 1945. This record
of 40 years suggests that the period beginning in 1914 can be described as a high-wage era in the sense
that the increases of this period were not matched by the wage advances of the preceding half-century.
On the question of organized labor's effect on wages, the evidence is conflicting. Attention should be 
called, however, to the striking rise in manufacturing real wages between 1929 and 1940 when average 
unemployment stood at a consistently high level and when the only tenable explanation of the rise in 
real wages must be public policy and trade unionism.


Limitations of Consumer Credit Statistics. Ernst A. Dauer, Household Finance Corporation.

Definition and content. All elements in the Federal Reserve Board's definition are relatively simple
and appear clear. However, differences arise in attempting to apply the definition, because there are 
legitimate differences in point of view, and because different people expect the figures to serve different 
purposes. Even the Federal Reserve Board does not seem to appreciate fully that difficulty also arises
because the consumer credit total figure is used, as a rule, for an altogether different type of analysis
than the break-downs. Their decisions on the items to be included for one type of analysis make it 
impossible for the resultant figures to serve properly in another analysis. The basic published estimates 
are limited to the amounts of credit outstanding. Figures showing the flow of consumer credit, if added, 
would provide a much more complete understanding of current developments. 

Methods of estimation. The recent revision has distinctly improved the technical accuracy and coverage
of the estimates. Even with their shortcomings, they are substantially better than the statistics
available in most economic areas. I am concerned about the broader, non-technical aspects of the problem.
Trained statisticians are aware: that all economic statistics measure differences of degree; that all
classification is subjective; and that every problem really requires its own set of statistics, if they are
to be truly germane. Most users, lacking formal training, are misled by the aura of accuracy and precision
which surrounds the mere issuance of estimates by a federal government agency. A clear, concise, and
simple disclaimer is necessary to dispel the mirage of infallibility.

On the surface the estimates provided for commercial banks—the most important single type of
holder—appear to be the most thoroughly grounded of any of the series. Yet, the whole procedure involves
guesswork, because the banker frequently does not know the use of each loan at the time it is
made; and experience shows that many respondents give whatever figures are easiest when condition
reports are prepared, semi-annually. The correction factor applied by the Board to exclude the non-consumer
portion from the reported figures is based upon a single survey—a sample which seems wholly
inadequate. The validity of using the findings of this single survey from 1939 to the present time, and
into the indefinite future, is questionable.
Presentation and classification. The presentation of data in the Federal Reserve Bulletin represents
an improvement over the old tables. This is true, particularly, of the functional break-down within
the instalment credit sector. Other basic tables break down the amount of instalment and non-instalment
credit respectively, according to holder. Here the decision to treat sales finance companies
on a consolidated basis (including the operations of cash lending subsidiaries) appears unwise and
should be [illegible]. To attain the stated objective of the Board (to present the data in such detail that it
can be taken apart and put together by analysts with various interests and various types of problems),
finer break-downs are necessary—and could be obtained if the Board made a serious attempt to


Recent Revisions on Consumer Credit Statistics. Homer Jones, Federal Reserve Board.

The Federal Reserve Board in 1953 completed a revision of the statistics regarding consumer
credit outstanding from 1939 to date. The new data are the first thorough revision of the series since
their inception in 1940, and were made possible by the results of the 1948 Census of Business. The revision
resulted in a substantial increase of estimates of consumer instalment credit outstanding and a substantial
reduction in estimates of charge accounts outstanding and a moderate increase in total consumer
credit. The revised data and a description of the process of revision are given in the Federal
Reserve Bulletin for April 1953. Technical aspects of the revision are discussed in a supplementary
pamphlet issued by the Board. While the revision of the series was desirable and necessary, it appears 
that the unrevised series had never been misleading as to order of magnitude of amount outstanding 
or of change in outstandings. Furthermore, it is believed that the use of annual Census Bureau data from 
the retail trade survey will provide a much more accurate basis for estimates in the future. 

In connection with the revision of these series, a thorough evaluation has been made of the concepts
involved and the means of making current estimates. Denomination of the series as "short- and inter- 
mediate-term consumer credit" has emphasized the fact that it covers only part of consumer credit, since 
house mortgages owed by owner-occupants are excluded. The “service credit” series has been thoroughly 
revised, primarily by basing its major component, amounts owed to medical practitioners, upon data 
from the Survey of Consumer Finances. Treatment of data supplied by commercial banks has been 
improved upon a basis of the results of sample surveys. 

The Function of the Outside Consultative Committee in the Revision of Governmental Statistics. 

B. D. Mudgett.


In the revision of governmental statistics, there are five parties at interest—employers, employees,
the government bureau in question, the outside public, and the whole body of statisticians interested 
in the particular statistical output. Employers' and employees' interests may or may not coincide with 
the ends sought in a revision; advantage accrues to the general public from a strong and healthy economy
and statistical procedures that assist this long run objective are in the interest of the general public.
The bureau personnel is in general highly competent but is beset both by routine operations that 
resist change and by outside pressures of groups dominated by their self interest. The whole body of 
scientists, as scientists, must be presumed objective and their pressures therefore lead in the direction of
constant improvement of statistical processes. Herein lies the good earth that must be worked by the 
committee of “experts” when they are asked to sit with bureau personnel in the revision of important 
bodies of governmental statistics. It is this situation that gives them a function the performance of which 
may aid in the improvement of government statistics. They operate to perform three tasks: (1) Better 
theory—the clarification of concepts back of particular bodies of data or particular statistical tools 
through give and take discussion between consultants and bureau personnel; (2) better liaison between 
government bureaus and the outside public, leading to education of the public on the significance of the 
work of bureaus and on the competence of staff; and (3) encouragement toward working conditions that 
will attract competent personnel and lead to improved career service in governmental agencies.


A Follow-up Study of Mortality in World War II Prisoners of War. Bernard M. Cohen, National
Research Council, and Maurice Z. Cooper, Veterans Administration.

Using tested methods of follow-up by matching existing military, Veterans Administration, and
other records, and by questionnaire survey, the Committee on Veterans Medical Problems, National 
Research Council, is conducting for the VA a follow-up study of mortality, morbidity, disability, and 
adjustment in U. S. white male Army personnel who were prisoners of war in World War II and liberated 
alive. Representative samples of both Pacific and European ex-prisoners, together with appropriate 
control groups, are being followed. This initial report is confined to mortality as observed in the first six 
years after liberation, and deals primarily with methods, The main methodological features are: (1) 
Demonstration from previous experience of the virtual completeness of VA death records of servicemen
and veterans, permitting accurate measurement of mortality differentials, and providing reliable
estimates of possible error where the differentials are small. (2) Selection of control samples from Army
unit records to match controls with prisoners in proportion of ground and air forces, officers and enlisted
men, and time distribution of and opportunity for capture, the latter two factors being better approximated
in the European than in the Pacific area. (3) Use of (a) an additional control, a sample of
veterans generally, to evaluate the effects on mortality of selection for military service, and (b) population
life table mortalities as an intermediate device to derive stable, annual, age-specific mortality
expectations.

The main findings are: (1) A marked excess of mortality in the Pacific prisoners, somewhat concentrated
in the first two years after liberation, and no excess of mortality in the European prisoners.
(2) In the mortality excess of Pacific prisoners, tuberculosis and automobile accidents are the most
conspicuous causes of death.


Mortality Investigations of Physical Impairments. E. A. Lew, Metropolitan Life Insurance
Company.

Life insurance mortality investigations may be regarded as a classical example of long range
follow-up studies. They have usually dealt with relatively large numbers of persons who have been
automatically traced for long periods of time through the circumstance of their being insured. The procedures
for such studies were fully developed by actuaries more than a hundred years ago. The paper
draws attention to the salient features of several types of life insurance mortality investigations and to 
the essential procedures used in such long range follow-up studies of physically impaired lives. It out- 
lines the scope and the principal findings of the more important investigations of physical impairments, 
and discusses both the limitations and the special value of the data for public health and medicine. 

The following mortality investigations of physical impairments are specifically referred to: Medico- 
Actuarial Mortality Investigations, 1912-1914; Medical Impairment Study, 1929 and its Supplement; 
Medical Impairment Study, 1936; Medical Impairment Study, 1938; Blood Pressure Study, 1939 and 
its Supplement; several recent studies of the effect of build on mortality; several recent studies based
on follow-ups of disabled policyholders.

Reference is also made to the Impairment Study, 1951, a comprehensive intercompany investiga- 
tion of the past fifteen years’ experience under some 132 groups of physical impairments, which is due 
to be published by the Society of Actuaries in the spring of 1954. 

The findings of these investigations have made it possible to express the long range prognoses for
a wide variety of impairments in numerical terms. They have been particularly valuable in shedding 
light on those impairments that fall in the broad range between good health and disease as recognized by 
clinicians, for example, overweight and moderate elevation in blood pressure. Follow-up studies of 
policyholders who recover after incurring a disability while insured have produced data indicative of
the long range prognoses for the more serious impairments, such as advanced tuberculosis and coronary
occlusion.


Factors in Interpreting Mortality After Retirement. Robert J. Myers, Social Security Administration.

Currently there is considerable discussion as to the advantages of individuals continuing in employ- 
ment beyond age 65 rather than being forced to retire compulsorily. Such advantages accrue both to the 
individual and to the nation. One of the subsidiary advantages frequently claimed is that an individual 
who is compelled to retire will lose his vitality and die much earlier than if he were allowed to continue 
in gainful employment. This runs contrary to the viewpoint frequently expressed several decades or 
more ago that workers were being kept in harness until they dropped dead from exhaustion rather than 
being allowed to spend their declining years in peace and leisure. 

Unfortunately, specific and reliable data as to the effect of retirement on mortality are not available. 
The analysis is complicated by the question as to whether people retire because they are disabled and
thus subject to high mortality or whether the retirement itself produces the high mortality. Clear evidence
is available that retired persons do have higher mortality than active workers and the general
population at the same ages, especially in the first few years after retirement. These data are presented
for various Governmental retirement systems and for certain selected non-Governmental programs.
Consideration is also given for various types of retirement systems as to how the resulting mortality
experience may develop and what biases and limitations may be present solely because of the particular
provisions of the plan rather than because of any underlying mortality effects.


Some Observations on the Inequality of Incomes. Herman P. Miller.

Throughout history philosophers have speculated about reasons for the inequality of incomes.
Various explanations have been offered. Some have stressed ability, others chance, and still others
institutional factors which give the children of the wealthy an undue advantage. This paper offers an
explanation for the skewness of the income curve. The thesis is that the skewed income distribution
reflects the merging of several symmetrical curves which differ only with respect to the level
and [illegible] of incomes. To explain income inequality it is first necessary to understand the reasons for
the differences in the component parts of the income curve.
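
The thesis that merged symmetrical curves can yield a skewed total can be checked numerically. What follows is a minimal sketch and is not taken from the paper: it assumes the component curves are normal and differ in level and in spread (the second attribute is not legible in this copy, so spread is an assumption), and the group weights, means, and standard deviations are purely hypothetical.

```python
def mixture_skewness(components):
    """Skewness of a mixture of normal curves.

    components: list of (weight, mean, standard deviation) tuples.
    Each component is symmetric on its own; only the mixture can be skewed.
    """
    total_weight = sum(w for w, _, _ in components)
    comps = [(w / total_weight, m, s) for w, m, s in components]
    mean = sum(w * m for w, m, _ in comps)
    # Second and third central moments of the mixture about its overall mean.
    variance = sum(w * (s ** 2 + (m - mean) ** 2) for w, m, s in comps)
    third = sum(w * ((m - mean) ** 3 + 3 * (m - mean) * s ** 2) for w, m, s in comps)
    return third / variance ** 1.5


# Hypothetical groups: a large low-income group and a smaller, more dispersed
# high-income group (all figures invented for illustration).
groups = [(0.6, 2000.0, 600.0), (0.4, 5000.0, 1500.0)]
skew_of_merged_curve = mixture_skewness(groups)        # positive: long upper tail
skew_of_single_curve = mixture_skewness([(1.0, 2000.0, 600.0)])  # zero
```

Under these assumptions the merged curve is skewed to the right even though neither component is, which is the qualitative point of the abstract.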

Much of the skewness of the income curve is due to the inclusion of women in the distribution.
The difference between income distributions for men and women has little to do with chance, ability,
or the possession of private wealth. Among men, nearly three-fourths of the highest income group
(over $10,000) are independent professionals, businessmen, or managers. To the extent that there is
free entry into these occupations, income differences between these groups and others may merely
represent payment by society for rare skills or risk taking. The facts regarding freedom of entry are
not now adequately known.


Dwellings in Relation to Income. Margaret Reid, University of Chicago.
This analysis of value of owner-occupied dwellings in relation to income using data from the 1950
Census is in a preliminary stage. Major findings include the following: (1) Coefficients of elasticity
of value of dwelling in relation to income with grouping by income of primary families and individuals
occupying them had a considerable range, e.g. from .18 in Cleveland to .63 in Birmingham. For S.M.As.
(Standard Metropolitan Areas) in general it appears to be around .30. (V = a + bI, where V equals the
logarithm of value of owner-occupied dwelling and I equals the logarithm of income of primary families
and individuals occupying them.) Many factors appear to contribute to this relatively low coefficient,
as well as to differences among S.M.As., e.g. distribution of households by age and sex of head and random
variations in income. Households with head 65 years or older or with female head tend to have a high
value-income ratio and tend also to concentrate at low incomes. In addition, random variations of incomes
tend to concentrate at low incomes persons with a transitory decline in income and at high levels
those with a transitory increase. This distribution of transitory incomes interferes seriously with differences
in current income indicating what families tend to do when changes occur in income status. However
the coefficient of .30 is quite similar to the corresponding coefficient of imputed rent of owner-occupied
dwellings in relation to income, for places surveyed by the U. S. Department of Labor during
the thirties. (2) The average value of owner-occupied dwellings for various S.M.As., standardized for
income level of the family, is related to the income of the S.M.A. This is consistent with earlier observations
that expenditure levels appear to be a function of the income of the community as well as the current
income of families. There seems good reason to believe that if the effect of random variations in
family income is eliminated the association of value of dwelling to community income as a separate factor
is no longer present. (3) Coefficients of elasticity of value in relation to income from a grouping
of households by value of dwelling tend to approximate 2.0. Since value of dwelling probably has less
random variation than current income, it may be that this regression provides a better measure of the
effect of difference in income status on the value of dwelling than does a grouping by current income.
Differences in level of the value curve among S.M.As. from this classification are related to age distribution
and not to income of the community. (4) Intergroup value-income relations among major portions
of the New York S.M.A. and among census tracts of several places yielded coefficients of elasticity of
value of dwelling to income around 2.0.
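
The relation V = a + bI quoted above is a straight line in logarithms, so the coefficient of elasticity b can be estimated as an ordinary least-squares slope. The following is a minimal sketch of that calculation, not the author's procedure: the function name `log_log_fit` and the income and value figures are hypothetical, and the values are constructed so that the fitted slope comes out at the .30 mentioned in the abstract.

```python
import math


def log_log_fit(incomes, values):
    """Least-squares fit of V = a + b*I, where V = log(value) and I = log(income).

    Returns (a, b); b is the coefficient of elasticity of value with
    respect to income. The slope does not depend on the base of the logarithm.
    """
    log_income = [math.log(x) for x in incomes]
    log_value = [math.log(y) for y in values]
    n = len(log_income)
    mean_i = sum(log_income) / n
    mean_v = sum(log_value) / n
    sxy = sum((i - mean_i) * (v - mean_v) for i, v in zip(log_income, log_value))
    sxx = sum((i - mean_i) ** 2 for i in log_income)
    b = sxy / sxx
    a = mean_v - b * mean_i
    return a, b


# Hypothetical income classes (dollars) and dwelling values generated with an
# elasticity of .30; these are not figures from the 1950 data.
incomes = [1500, 2500, 3500, 5000, 8000]
values = [400 * x ** 0.30 for x in incomes]
a, b = log_log_fit(incomes, values)
```

With real grouped data the slope would depend on how households are grouped, which is the contrast the abstract draws between grouping by income (around .30) and grouping by value of dwelling (around 2.0).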


Changing Geographic Patterns of Migration in the United States. Henry S. Shryock, Jr., U. S. Bureau
of the Census.

In order to study trends in net migration by States and possible changes in the overall pattern of 
interstate migration, figures were compiled from a variety of sources. Many of the data used were
estimates especially prepared for this purpose. The following nine periods were included: 1950-1952, 
1949-1950, 1945-1950, 1942-1945, 1940-1942, 1935-1940, 1930-1935, 1920-1930, and 1910-1920. Using 
the States as units, product-moment correlation coefficients were computed for all pairs of time periods. 
Of these 36 correlations, 31 were significantly different from zero at the .05 level and were all positive. 
The median value of all the correlation coefficients was +.58, and the range was from -.03 to +.03.
There did not appear to be any systematic changes in this pattern over time; adjacent periods were not
more highly correlated than nonadjacent periods. Furthermore, the pattern by States did not seem to be 
related to the gross volume of interstate migration of a given period. There were indications that the
pattern was sensitive to economic conditions and to war conditions. The relationship was not a simple
one, however.
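
The count of 36 correlations follows from taking the nine periods two at a time (9 × 8 / 2). The sketch below shows how such a set of product-moment coefficients could be computed with the States as units; it is an illustration only, the helper names are hypothetical, and no migration figures from the paper are reproduced.

```python
from itertools import combinations
import math

# The nine periods listed in the abstract.
PERIODS = ["1950-1952", "1949-1950", "1945-1950", "1942-1945", "1940-1942",
           "1935-1940", "1930-1935", "1920-1930", "1910-1920"]


def pearson(x, y):
    """Product-moment correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pairwise_correlations(rates_by_period):
    """Correlate every pair of periods.

    rates_by_period maps a period label to a list of net migration figures,
    one per State, with the States in the same order for every period.
    """
    return {(p, q): pearson(rates_by_period[p], rates_by_period[q])
            for p, q in combinations(rates_by_period, 2)}


PAIR_COUNT = len(list(combinations(PERIODS, 2)))  # number of period pairs
```

Given a dictionary with one list of State figures per period, `pairwise_correlations` returns one coefficient per pair of periods, from which a median and range like those reported can be read off.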

These conditions have more obvious effect on the figures for some of the individual States. Many 
of these States do show fairly definite trends in net migration. The year from April 1949 to April 1950 
had the most atypical pattern of interstate migration. For this year and the period 1935-1940, statistics 
are available on migration in both directions between pairs of States. Four States (California, Michigan, 
Tennessee, and Texas) were selected for intensive examination. For each of these States, the im- 
portant interchanges with other States were listed, and changes between the two periods were noted. 
Some of the most important shifts in migratory currents were thus detected. The quality of the basic
data, additional information needed, and some questions for future examination are also discussed. 


The Problem of Improving Mineral Statistics. J. E. Morton, Cornell University. 

The problem of improving mineral statistics is only a part of a universal problem: the improvement 
of the systematic mass production of economic data in general. That the collection of mineral statistics 
should create a special problem may be attributed to the late but rapidly growing interest in the mineral
sector of our economy; in part, the problem is due to the technological peculiarities of the extractive
industries. Weak spots and other “pathologies” in a given fact finding machinery are easily identified if
one projects such machinery against generally accepted requirements and standards. Examples of such
requirements applying to any efficient and well organized mass data production process are: (1) The
proper specification of the end product; (2) The efficient production of the data while adhering to accepted
statistical standards; (3) The quality control of the product to protect the data-consumer and
to yield a basis for continuing improvement of the data production process. Comparing—from the
above points of view—mineral statistics with other major types of economic data collection systems, e.g.
agricultural, manufacturing, population and employment statistics, one must admit that mineral statistics
frequently violate the above sketched requirements. The resulting weaknesses have been reflected
in such diagnostic reports as that by the Hoover Commission, the President’s Material Policy Commission
and the American Statistical Association's Survey of the Statistical Operations of the Bureau of
Mines.

In conclusion, it appears that several basic steps will have to be undertaken before special thera- 
peutical measures can be applied effectively. These steps are: (1) The development of a broad, yet
specific, mineral statistics policy which will not only recognize the difference between the needs of the 
technologist on one hand and those of the business analyst and economist on the other, but also the dis- 
crepancy between the commodity and the industry-wide point of view. (2) The adherence to modern 
statistical standards and the development of new methods and procedures where justified by the 
peculiarities of the extractive industries. (3) The reconsideration of the administrative framework 
within which efficient production of mineral statistics is to develop, including the very important ques- 
tion of how to generate, attract, and best utilize the particular kind of rare combination of talents and 
skills needed for the successful operation of a mineral statistics program. 


The Economic Outlook for 1954. Gerhard Colm, National Planning Association.


To maintain "full employment" next year an increase in total production of about $10 billion would
be necessary. Surveys of present spending intentions of consumers, business, and government for the
next year do not indicate any sharp increases or decreases in total buying but on balance a mild further
decline. The defense demand of the Federal Government may be down by $2 or $3 billion next year, but
the continuing rise in state and local expenditures for roads, schools, hospitals, and other improvements
will offset much of the decline in Federal demand. Business enterprises are planning only slightly less
spending for plant and equipment in 1954 than in the peak year 1953. According to a recent survey of
consumer attitudes, people in general are optimistic about next year's income and feel that the present
is a "good time to buy." However, some decline in consumer incomes is likely. Since there is no indication
that consumers intend to spend a larger proportion of their incomes, consumer spending is likely
to show a moderate decline. If one simply puts the fragmentary indications of present intentions into a
coherent picture and allows for some decline in inventories, one would reach the conclusion that the year
1954 would bring a reduction in total production of about $10 to $15 billion instead of the desirable
full employment increase of $10 billion. This would mean a level of activity of 5 or 6 per cent below the
full employment level. Unemployment might rise from 1j million in 1953 to perhaps 3} million in 1954. 

Before accepting this as a “forecast” it must be recognized that present intentions, which are 
the basis for this outlook, may well be changed. In view of the possible weakness in markets for some 
goods, it is possible that business might engage in rather substantial inventory liquidation and might
also revise downward its expansion programs. On the other hand, forward looking businessmen may
think that conditions warrant a stepped up modernization program. They might also push ahead the 
development of new products and make the purchase of goods more attractive so that consumers are 
persuaded to use some of their liquid reserves for increased purchases. Finally, the Government in
response to an economic downturn might adopt tax reductions beyond present plans, might adopt
measures to stimulate the construction of residential houses, or might step up other useful
programs. In our present state of knowledge it is only possible to indicate the economic trend on account
of present intentions. This trend is mildly down. If the community maintains its confidence that the
Government is ready and willing to act, it should be possible to prevent the downturn from developing
into a depression. How consumers, business, and government respond to this trend, whether their
response will aggravate, mitigate, or reverse it, can be a subject of discussion but not of any forecast
which pretends to be more than one man's opinion.


Economic Forecast of the Agricultural Situation, 1954. Oris V. Wells, U. S. Department of Agriculture.
No marked change in the domestic demand for food and other agricultural products appears likely
in 1954 as compared with the current year. Also, foreign takings of United States farm products, while
sharply reduced in the 1952-53 season from other recent years, appear to be at a level sustainable over
the next year or so. Supplies of most farm products are expected to continue large in 1954. Carryover
stocks will increase further by the end of the current marketing year, but a large part will be held by
the Government. Acreage restrictions are likely to bring smaller wheat and cotton crops in 1954 and
price support programs will continue to cushion the impact of large supplies on farm prices. With pros-
pective conditions of demand and supply for farm products in 1954 approximately the same as in 1953,
the average of prices received by farmers may hold near current levels. With cost rates to farmers
stabilizing, the cost-price squeeze in agriculture is not likely to be intensified significantly in 1954.


Achieving Maximum Prediction per Unit of Testing Time. John T. Dailey, Bureau of Naval Personnel.

In an aptitude battery of finite length composed of pools of homogeneous items, how long and thus
how reliable should each test be in order to maximize the composite validity of the battery and thus
obtain maximum prediction per unit of testing time? For tests composed of homogeneous items, test
validity and reliability vary concomitantly with the number of items, and varying the test length alters




340 AMERICAN STATISTICAL ASSOCIATION JOURNAL, JUNE 1954 


both reliability and validity in a predictable manner. Formulas are derived and presented for predicting
correlations with tests of altered length.

When homogeneous items are added to a given test, the amount each successive item adds to the
multiple validity falls off sharply. Items added last to a long test of highly valid type items may add less
than items added first to a less valid type of test. Data are presented to predict the results of a shortening
of each test in the Aviation Cadet Classification Test Battery. It is demonstrated that, for a given
amount of testing time, appreciably greater multiple validity may be obtained by using a large number
of short tests rather than a smaller number of long tests. These predictions were verified by an empirical 
study. 
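
The summary does not reproduce the formulas themselves. As a minimal sketch, and assuming they follow the standard Spearman-Brown relations for a test whose length is multiplied by a factor k (an assumption, not something the summary states), the diminishing return from added items can be illustrated as follows. The function names and the numerical values are hypothetical.

```python
import math


def lengthened_reliability(r_xx, k):
    """Spearman-Brown: reliability of a test whose length is multiplied by k."""
    return k * r_xx / (1.0 + (k - 1.0) * r_xx)


def lengthened_validity(r_xy, r_xx, k):
    """Correlation with a fixed criterion after the test length is multiplied by k."""
    return r_xy * math.sqrt(k) / math.sqrt(1.0 + (k - 1.0) * r_xx)


# Hypothetical test: reliability 0.50 and validity 0.30 at its current length.
for k in (0.5, 1, 2, 4):
    print(k,
          round(lengthened_reliability(0.50, k), 3),
          round(lengthened_validity(0.30, 0.50, k), 3))
```

Under these assumed relations each doubling of length adds less validity than the previous one, which is the pattern the summary describes for items added to an already long test.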


Family Interaction and the Transmission of Achievement-related Attitudes. Fred L. Strodtbeck,
University of Chicago.

Forty-eight recorded discussions between father, mother, and adolescent son were analyzed and
a significant negative correlation was found between the power of the father in the family decision-
making and the score of the son on an achievement-related attitude scale. The families were second
generation (the son third), of Jewish and Italian ethnicity, divided between over- and under-achieving
students and stratified into three socio-economic status groups forming a 2 × 2 × 3 factorial design. The
frame for selecting the sample was created by administering questionnaires to all children between 14
and 17 in parochial and public schools in an Eastern City. The study is a part of the research on the
early identification of talented persons sponsored by the Markle Foundation through the Social Science
Research Council.


The Validation of Testing Programs for University Students. William B. Schrader, Educational Test-
ing Service.

The widespread use of tests for selection and guidance in American universities has been accom-
panied by the growth of testing programs. A testing program may include all the steps needed in col-
lecting data on test performance by examinees and in transmitting the test results in a convenient form
to appropriate test users. Validation of tests for predicting academic achievement contributes to a pro-
gram by aiding in the evaluation of program components, in the improvement of test offerings, and in
the effective use of test results.

A comparative study of mathematical aptitude and achievement materials for predicting engineer- 
ing school grades provided useful information for an administrative decision by the College Entrance 
Examination Board. A simple method of assigning optimal testing times to test parts indicated that the 
Law School Admission Test could be appreciably shortened while maintaining the same validity. Graphs 
have been used to aid in combining test scores with previous academic record and expectancy tables 
have been developed to aid test users in interpreting predictions. The planning and interpretation of
validity studies should provide for: sampling institutions according to a defined plan, using test scores
jointly with other predictors, studying homogeneous student groups, avoiding capitalization on chance,
taking account of restriction of range of talent, using a regression approach wherever possible, and re-
porting results simply. Criterion development is a promising though difficult field. Broadening of the
criterion to include other major outcomes of college and professional education than those reflected in
grades is needed. Also needed is long-range validation of the tests administered in the 1920's against
criteria of adult success.


Statistical Principles of Testing. John Mandel.

The answer to many problems in science, both fundamental and applied, is found in small differ-
ences in the numerical values of a few measurements. In determining these differences, it is necessary
to guard against biases introduced by the testing procedure. One of the principal functions of statistical
design is to detect and neutralize these biases. Examples are given to illustrate this approach:

(1) In an experiment made for the purpose of testing the homogeneity of 4 batches of polyisobutyl-
ene by means of flow-viscosity measurements, a 4 × 4 latin square was used. Considerable day-to-day
variation, as well as systematic effects of chronological order within days, were found. Correction of the
data for these systematic effects permitted an evaluation of the degree of homogeneity of the material,
within prescribed limits of uncertainty, that would not have been possible without statistical design.
(2) In many cases experiments run in parallel display much greater agreement than experiments run
at different times or in different laboratories. This fact can be used to increase precision by the well-
known device of the “control sample.” Statistical methodology has broadened the idea of the control
sample to include the concept of the statistical “block,” thereby eliminating the need for an actual
control sample in many instances. The idea is illustrated by a road test of eight automobile tire brands
for rate of tread wear, run in accordance with a chain-block design with two-way elimination of hetero-
geneity. Both run-to-run variation and the effect of wheel position are eliminated from the comparison


SUMMARIES OF PAPERS 341 


of the brands. (3) In interlaboratory experiments, the specimens for test are usually allocated entirely
at random among the various laboratories. A judicious utilization of the idea of the experimental “block”
can sometimes increase the precision several fold. Sheets of vulcanized rubber often show considerably
more variability than that observed on any one sheet. In an aging study of rubber, carried out as an
interlaboratory experiment, the information per measurement might well have been increased six-fold
if the specimens necessary to obtain individual aging curves had been taken from the same sheet.


Some Comments on the Lot Plot Plan. L. E. Moses, Stanford University.
The lot plot plan of sampling inspection was conceived by Shainin as a method to give a high degree
of protection against acceptance of lots with fraction defective in the vicinity of 1/10 of 1%. The plan
is supposed to be effective regardless of the character of the lot. It has found wide adoption in many
industries in this country. Little is known of its theoretical basis. The plan calls for taking a random
sample of 50 observations and plotting the histogram for the sample, as well as the mean and average
range (from the 10 sets of 5 observations). From the appearance of the histogram the inspector decides
whether to view the lot as “normal,” “flat-topped,” “long-tailed,” “bimodal,” “skew,” or “truncated.”
For each such type there is a somewhat different way of deciding upon the acceptability of the lot.
This paper presents indications that the operating characteristics of the plan depend markedly
on the character of the lot; that there is little hope of using a 50-observation histogram to detect “small”
departures from normality which greatly distort the O.C. curve. The special types of analysis prescribed
for skew and bimodal samples are considered, and found to have rather unsatisfactory properties in
general.
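
The two summary figures named above, the sample mean and the average range over the 10 sets of 5 observations, can be computed as in the rough sketch below. The step that turns the average range into a standard-deviation estimate (dividing by the control-chart constant d2 = 2.326 for subgroups of five) is a standard convention assumed here for illustration; the summary does not say how the plan uses the average range. The function name is hypothetical.

```python
D2_FOR_FIVE = 2.326  # standard control-chart constant d2 for subgroups of size 5


def lot_plot_summary(sample):
    """Return (mean, average range, sigma estimate) for observations taken in order.

    The sample is split into consecutive sets of 5 observations.
    """
    if len(sample) == 0 or len(sample) % 5 != 0:
        raise ValueError("sample length must be a positive multiple of 5")
    groups = [sample[k:k + 5] for k in range(0, len(sample), 5)]
    mean = sum(sample) / len(sample)
    average_range = sum(max(g) - min(g) for g in groups) / len(groups)
    sigma_estimate = average_range / D2_FOR_FIVE  # assumed convention, see above
    return mean, average_range, sigma_estimate
```

For a 50-observation sample this yields the 10 ranges the plan calls for. Classifying the histogram shape is left to the inspector's judgment in the plan as described, so it is not attempted here.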


Continuous Sampling Plans. Harry Weingarten.
Continuous sampling plans (for acceptance inspection of material not assembled in “lots”) available
in the literature were designed for a manufacturer interested in keeping a check on his production and
also guaranteeing an AOQL. The plan proposed by H. F. Dodge, “A Sampling Inspection Plan for
Continuous Production,” in the September 1943 issue of the Annals of Mathematical Statistics, is such
a plan. Dodge's plan requires the alternation of 100% inspection and sampling. Two quantities i and f
are specified: if, after starting with 100% inspection, i consecutive items are found free of defectives,
sampling at the rate f begins. If a defective is found during sampling, the inspection returns to 100%
until again i consecutive items are free of defectives, etc. For a fixed value p of the incoming per cent
defective, i and f determine the AOQL.
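
As an illustrative sketch of how i and f enter, the code below uses the usual textbook expressions for Dodge's plan: with q = 1 - p, the long-run fraction of items inspected is F = f / (f + (1 - f) q^i), and the average outgoing quality is AOQ = p (1 - F) when every defective found is removed or replaced. These expressions are not given in the summary and are assumed here; the AOQL is approximated by a simple grid search over p. All function names are hypothetical.

```python
def fraction_inspected(p, i, f):
    """Long-run fraction of items inspected at incoming fraction defective p."""
    q_to_i = (1.0 - p) ** i
    return f / (f + (1.0 - f) * q_to_i)


def average_outgoing_quality(p, i, f):
    """AOQ, assuming every defective found is removed or replaced."""
    return p * (1.0 - fraction_inspected(p, i, f))


def approximate_aoql(i, f, grid=10000):
    """Approximate the AOQL as the largest AOQ over a grid of p values in (0, 1)."""
    return max(average_outgoing_quality(k / grid, i, f) for k in range(1, grid))


# Hypothetical plan: i = 10 clear items required, then sample 1 item in 10.
print(approximate_aoql(10, 0.1))
```

The closed form is used instead of simulating the inspection process so that the sketch stays short and deterministic.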

Emphasis in adopting the Dodge Plan for use by a purchaser (Bureau of Ordnance, Navy Depart-
ment) was placed on reducing the amount of inspection required. If poor quality were submitted for in-
spection, the purchaser would, in effect, perform a screening operation for the manufacturer. The
adaptation utilized the rejection number concept of lot acceptance plans; thus the Dodge Plan is oper-
ated as long as less than a + 1 defectives are found. When inspection is interrupted, upon finding the
(a + 1)st defective, it is not resumed until the manufacturer finds and removes the cause for defective
product.
A sampling plan in Navord Standard 81 is defined by four quantities: N, anticipated volume of
production; i, length of 100% inspection requirement; f, sampling rate; and a, maximum number of
defectives allowed before interruption. This generalization of the Dodge Plan (Dodge Plan: $N = \infty$,
$a = \infty$) produces mathematical problems of great complexity. If the effect of a plan defined by N, i, f,
and a is described by

$L_p = P(\text{not interrupting inspection})$,

it was possible to find $L_p$ exactly only for a = 0, 1, 2. For larger values of a an approximation was used.
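
One rough way to see what the probability of not interrupting inspection measures is to simulate the plan directly. The sketch below is a Monte Carlo illustration under stated assumptions, not the exact or approximate method of the paper: items are taken to be independently defective with probability p, sampling at rate f is modelled as inspecting each item independently with probability f, and inspection is interrupted as soon as a + 1 defectives have been found. The function names are hypothetical.

```python
import random


def run_plan(N, i, f, a, p, rng):
    """Simulate one production run; return True if inspection is never interrupted."""
    clear_run = 0      # consecutive defect-free items seen under 100% inspection
    sampling = False   # False: 100% inspection phase, True: sampling phase
    found = 0          # defectives found so far
    for _ in range(N):
        defective = rng.random() < p
        inspected = (rng.random() < f) if sampling else True
        if not inspected:
            continue
        if defective:
            found += 1
            if found > a:          # the (a + 1)st defective interrupts inspection
                return False
            sampling = False       # return to 100% inspection
            clear_run = 0
        elif not sampling:
            clear_run += 1
            if clear_run >= i:     # i clear items in a row: switch to sampling
                sampling = True
    return True


def estimate_no_interruption(N, i, f, a, p, trials=2000, seed=0):
    """Monte Carlo estimate of P(not interrupting inspection)."""
    rng = random.Random(seed)
    return sum(run_plan(N, i, f, a, p, rng) for _ in range(trials)) / trials
```

A call such as `estimate_no_interruption(500, 20, 0.1, 3, 0.01)` (hypothetical values) gives a simulated figure that could be set beside an exact or approximate calculation.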


Problems of Coordinating the United States Statistical System. Stuart A. Rice.
The statistical system of the United States embraces many official, semi-official, and unofficial
agencies and instruments, among which some degree of coherence is maintained by item to item adjust-
ments among related tasks and processes. The relationships may be those of conceptual congruity or
those of consistency in operational patterns and sequences. The process of maintaining coherence is
called coordination. In the decentralized statistical system of the United States a particular problem
which falls to the Office of Statistical Standards as the Government's central agency of statistical co-
ordination is that of representing the general interests. These often override or lie between the interests
and responsibilities of particular statistical agencies. Although a cross-cutting “statistical budget” is
prepared by OSS each year, its components must be fitted into the over-all budgets of departments
and agencies, thus raising special problems. Other important problems of coordination are the establishment
of balance among separate agency programs, of establishing lines of demarcation between governmental and
non-governmental responsibilities, and of finding the appropriate boundaries to the concepts of “confi-
dentiality” and national security interest in data. Finally, the problem of liaison between Federal



statistical agencies and the statistical profession is being solved in piecemeal and modest but satisfactory 
ways through advisory mechanisms established by the American Statistical Association. 


Problems of Co-ordination in the Canadian Statistical System. HERBERT MARSHALL. 

Canada's statistical system is highly centralized. The Dominion Bureau of Statistics was set up 
under a Statistics Act in 1918 and instructed to “organize a general scheme of co-ordinated social and 
economic statistics pertaining to the whole of Canada and to each of the provinces thereof.” To reach 
this objective several types of co-ordination were necessary. Since Canada is a federal state the raw
materials for some kinds of statistics are derived from the administrative records of both federal and
provincial departments. These administrative records have to be fitted into the general overall scheme.
Co-ordination with provincial departments is achieved mainly by annual, or less frequent, Dominion-
Provincial Conferences in various statistical fields. The bulk of the Bureau's output is based on informa-
tion collected directly rather than from administrative records and includes censuses of population,
agriculture, industry, distribution, labour and prices, and numerous other fields. Co-ordination with
those who fill in the questionnaires and those who use the data involves close liaison with numerous
business and other organizations and a wide variety of users of the data. Co-ordination of the work
done within the Bureau of the fourteen Divisions and numerous sections is of great importance. Uni-
formity in concepts, definitions, classifications, avoidance of duplication, and other aspects of co-
ordination require constant vigilance on the part of interdivisional committees, and other supervisory
efforts.


Methodological Problems and Findings of Study of Recipients of Old-Age Assistance. Thomas G.
Hutton, Department of Health, Education, and Welfare.

The major methodological problem faced by the Bureau of Public Assistance in making the study
of old-age assistance recipients, as in all its studies, is how to collect data that are comparable from
State to State when the Bureau must conduct its studies with personnel of 53 State public assistance
agencies each operating under its own, and frequently diverse, policies, and with case workers in over
1,000 governmental units scattered throughout the length and breadth of the land. These workers have
diverse backgrounds and varying degrees of technical skill. Contacts with State personnel and local case
workers are almost entirely through the written word, and the Bureau is dependent on obtaining the
cooperation of these workers in order to obtain data that are reasonably accurate. The solution to this
problem, while not a statistical one but one in the field of human relations, is an integral part of making
such a study. The paper discusses how this problem was solved.

The study of old-age assistance recipients was designed to obtain answers to two questions: (1) Who 
are these old people that are receiving public assistance, and (2) How do they live. Answers to the first 
question were obtained in terms of certain social characteristics; age, sex, race, marital status, and 
physical and mental condition. Answers to the second question were obtained through information on
living arrangements, extent of home ownership, certain housing characteristics relating to extent of
overcrowding, sanitary facilities in the home, and the use of modern conveniences such as electricity,
the telephone and refrigeration. The degree to which children contributed to the support of aged parents,
the amount and sources of income other than assistance, and the total amounts of income on which 
these people lived also contributed to answering the second question. The findings of the study on each 
of these points are presented in brief compass in the paper. 


Methodological Problems and Findings of Survey of Aged Beneficiaries of Old-Age and Survivors In-
surance. Edna C. Wentworth, Social Security Administration.
In the fall of 1951 the Bureau of Old-Age and Survivors Insurance conducted a Nation-wide survey
of the economic resources of retired worker and aged-widow beneficiaries. The paper discusses the meth-
ods used in evaluating the economic situation of the beneficiaries and some of the findings for those
who received benefits throughout the survey year. Slightly over three fifths of the beneficiaries had in-
come in addition to their own independent retirement income, or they used assets. The chief sources of
the additional income were public assistance, contributions from relatives outside the household, and
earnings, usually from short-time employment. Two thirds of the beneficiaries either received public
assistance or had less money income from all sources than public assistance would allow its recipients
who lived alone in rented quarters; a sixth were assistance recipients. Half of those with the low in-
comes and no public assistance got along because they lived with relatives and were partially supported
by them. Some of the others had noncash income of various kinds. Altogether, half the beneficiaries were
partially dependent during the survey year, more of them on relatives than on public assistance.
If they used one tenth of their liquid assets each year, most (70 per cent) of the retired workers with
the highest benefits—$60 to $68.50—would have independent retirement funds for the next 10 years of
at least $900 a year if single and $1,500 if married. Only 36 per cent with benefits of $50 to $59 would
have such independent retirement funds. Those with smaller benefits were worse off. Homes were owned
by 45 per cent of the beneficiaries but the homes are not taken into consideration in this appraisal of
beneficiary resources.


Intervals Between Onsets of Multiple Cases of Poliomyelitis in Families. Arthur S. Littell and
George V. Smith, Western Reserve University.


Recently P. E. Sartwell has shown that observed incubation periods of poliomyelitis are described
by a log-normal frequency distribution. Using this distribution of incubation periods an expected dis-
tribution of intervals between onsets of pairs of cases exposed simultaneously was computed. Six ob-
served series of intervals between onsets of initial cases and onsets of subsequent cases within families,
1310 intervals in all, are compared with the distribution expected if all cases in each family resulted
from common exposure. The agreement between observed and expected distributions is very good in
some of the series. In the others, the disagreements are opposite to what one would expect if there were
an appreciable proportion of secondary infections within families. Instead of an excess of intervals
longer than 7 days, which would represent secondary cases, there is a deficiency.

It is shown that, even if non-susceptible carriers are partially responsible for the spread of polio-
myelitis, an observed distribution of intervals should show an excess due to secondary cases of intervals
longer than 7 days over the number expected on the basis of common exposure. This argument points
to one of two conclusions: (a) when multiple cases of poliomyelitis appear in a family, the source of in-
fection is common, and spread of the disease from case to susceptible within the family is rare; or, (b) a
case of poliomyelitis can be infectious very early in its incubation period.


Unsolved Problems of Experimental Statistics. John W. Tukey, Princeton University.

It would not be misleading to say that there is only one unsolved problem of experimental statistics 
—“How can we identify the problems of experimental statistics?” (We can identify a good many un- 
solved problems by accident, but we probably miss many important ones for far too many years.) Ex- 
perience to date indicates that difficulties in identifying problems have delayed statistics far more 
than difficulties in solving problems. This seems likely to be the case in the future, too. 

Thus it is appropriate to be as systematic as we can about unsolved problems. Any system may 
be a start toward a partial solution of this one central unsolved problem. We shall try to do this by 
stating first some hypergeneral principles and then some general consequences. We shall strive to phrase 
these as generally as possible, in the hope of prolonging their useful life. The discussion of examples of 
these 18 general principles will set forth a number of unsolved problems, while a list of 37 provocative
questions poses many more. The account closes with a discussion of the possibility of orienting experi- 
mental statistics toward problems rather than techniques. 

Some general principles. If we feel that the detailed problems of experimental statistics arise from
the interaction of certain general principles among themselves and with classes of experiments, it is rea-
sonable to try to state and illustrate some of these principles. Most of these hang on four hypergeneral
principles, which may seem harmless until we come to their consequences, namely: (A) Different ends
require different means and different logical structures. (B) In each area, statistical method must and
does evolve, mainly by adding both immediate ends and considerations. (C) While techniques are im-
portant in experimental statistics, when to use them and why to use them are more important. (D) In
the long run, it does not pay a statistician to fool either himself or his clients.


Characterization of Distribution-free Statistics. Z. W. BIRNBAUM. 

1. Definitions. Let $\Omega$ and $\Omega'$ be families of cumulative probability functions. A real quantity
$W = S(X_1, X_2, \cdots, X_n, G)$ is a statistic in $\Omega$ with regard to $\Omega'$ if, for any $G \in \Omega$ and $X_1, X_2, \cdots, X_n$ in
the n-dimensional sample space $E_n$ of a random variable $X$ with the c.p.f. $F \in \Omega'$, this quantity $W$ is defined
in $E_n$ and $W$ has a probability distribution; this probability distribution is then de-
noted by $P[S(X_1, \cdots, X_n, G); F]$.

If $\Omega = \Omega'$ and the statistic $S(X_1, \cdots, X_n, G)$ has the property that $P[S(X_1, \cdots, X_n, G); G]$ is
independent of $G$ for $G \in \Omega$, then $S(X_1, \cdots, X_n, G)$ is a distribution-free statistic in $\Omega$.

Let $\Omega$ be a family of c.p.f.'s such that the inverse function $G^{-1}$ can be defined for each $G \in \Omega$. A
statistic $S(X_1, \cdots, X_n, G)$ in $\Omega$ with regard to a family $\Omega'$ is called strongly distribution-free if
$P[S(X_1, \cdots, X_n, G); F]$ depends only on $\tau = FG^{-1}$ for all $G \in \Omega$, $F \in \Omega'$.

If there exists a function $\Phi$ defined on the n-dimensional unit cube and symmetric in its arguments
such that for any $G \in \Omega$, $F \in \Omega'$ we have $S(X_1, \cdots, X_n, G) = \Phi[G(X_1), \cdots, G(X_n)]$ almost everywhere in
$E_n$ for the random variable $X$ which has c.p.f. $F$, then $S(X_1, \cdots, X_n, G)$ is a statistic of structure (d) in
$\Omega$ with regard to $\Omega'$.
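
As a small illustration of the last definition, Kolmogorov's statistic can be written as a symmetric function of the values G(X_1), ..., G(X_n) alone. The sketch below (function names and sample values are hypothetical, and the two c.p.f.'s are chosen only because their inverses have closed forms) evaluates that function for samples built from the same values u_k = G(X_k) under two different G and shows that the statistic is unchanged.

```python
import math


def phi_kolmogorov(u):
    """Phi on the unit cube: sup distance between the empirical c.p.f. of u
    and the uniform c.p.f.; sorting makes it symmetric in its arguments."""
    ordered = sorted(u)
    n = len(ordered)
    return max(max((k + 1) / n - v, v - k / n) for k, v in enumerate(ordered))


def statistic(xs, G):
    """S(X_1, ..., X_n, G) = Phi[G(X_1), ..., G(X_n)]."""
    return phi_kolmogorov([G(x) for x in xs])


def G_exponential(x):
    return 1.0 - math.exp(-x)


def G_exponential_inverse(u):
    return -math.log(1.0 - u)


def G_logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def G_logistic_inverse(u):
    return math.log(u / (1.0 - u))


u_values = [0.12, 0.55, 0.31, 0.90, 0.47]
d_exponential = statistic([G_exponential_inverse(v) for v in u_values], G_exponential)
d_logistic = statistic([G_logistic_inverse(v) for v in u_values], G_logistic)
print(d_exponential, d_logistic)
```

Because the statistic depends on the sample only through G(X_k), its distribution when the X_k are drawn from a continuous G is that of the same function of independent uniform variables, whatever G is.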

2. Problems. Some of the problems which may be formulated in terms of the concepts defined
above are: (a) given $\Omega$, find the class of statistics which are distribution-free in $\Omega$, or strongly dis-
tribution-free in $\Omega$ with regard to a given $\Omega'$; (b) given $\Omega$ and a loss-function L, determine the distribu-
tion-free statistic which is optimum for a specified decision-theoretical problem.

The following are some results obtained by the speaker and H. Rubin for problems of type (a): If a
statistic in the class Q: of all continuous c.p.f.’s with regard to 0, has structure (d) then it is distribution: 
free in M; but not every statistic of structure (d) in Q: with regard to 0, is distribution-free in ФА 
statistic in the class Q* of strictly increasing continuous c.p.f.'s with regard to Q* is strongly distribution- 
free if and only if it is of structure (d). For problems of type (b) O. P. Aggarwal has recently shown
that for estimating a cumulative probability function the minimax procedure invariant under all per-
mutations of the sample is, for certain plausible loss functions, either that based on Kolmogorov's 
statistic or on some modifications of that statistic. 


Inadequacies of the Construction Estimates as General Economic Measures. Walter E. Hoadley,
Jr., Armstrong Cork Company.

Construction merits general recognition as a major American industry, but few leaders within this
diversified industry show much interest in aggregate activity statistics, and most “outsiders” have so
little first-hand knowledge of construction that they tend to accept almost without question any avail-
able measures of construction, especially from “official” sources.

At least five general inadequacies stand out in the construction statistics field at present: (1) a
widespread lack of understanding and appreciation of the importance of adequate construction statistics
to public and private policies, (2) weaknesses in the basic construction series currently available, (3)
major gaps in construction statistics, (4) the absence of a comprehensive plan to strengthen government
statistics in the construction field, and (5) insufficient funds to insure suggested improvements can be
carried out successfully. As construction activity has grown in size and importance to the whole econ-
omy, our knowledge about it has lessened. This unfortunate condition is not due to lack of competence
in the governmental statistical agencies, which are doing just about the best job that can be done with
the resources available. It is due primarily to the failure of Congressmen, businessmen, and others to
recognize the importance of sound data as a basis for market planning or to recognize just how inade-
quate the present data are for this purpose.

Since the Korean War, for the first time a construction estimate series—the nonfarm housing start 
series of the U. S. Bureau of Labor Statistics—was written into law as a "trigger" statistic to help time 
changes in a government control (i.e., residential credit). This last move, perhaps more than any other 
single development, stimulated new interest in government statistics across the home building industry. 
Government construction specialists also quickly recognized the need to reexamine the adequacy of their
statistics in light of their increased role in policy determination. Nevertheless, few major policy decisions
affecting the construction industry can now be properly made on the basis of the statistics available.

The most encouraging development this year has been the extent to which policy level people 
within and outside government have become aware of many deficiencies in construction statistics and
have started to promote plans designed to improve several key series. 

The three principal statistical gaps are: (1) Current indications of changes in the national housing
inventory, reflecting the continuing impact of new buildings, conversions, and demolitions. (2) Esti-
mated vacancies in old and new dwellings. (3) Estimated expenditures for “fix-up” purposes (i.e., re- 
pairs, additions, and alterations to existing construction). These data are absolutely essential for the 
housing and construction segment of the economy to prosper and avoid major swings. 

The pressing job to be done in government—with the aid of all persons interested in construction
—must be to obtain some general agreement as to the basic objectives of government in the construction
field. Government policy officials must decide what they feel they need to make sound decisions in the
interest of growth and stability in the economy. Only when the major deficiencies and gaps in “public
policy” statistics on construction have been substantially eliminated should attention be directed to
meeting the numerous and ever-present requests for more detailed market type information—except of
course as such becomes available as a by-product of public policy statistics. In my view, the construc-
tion industry and the economy as a whole have much more to lose from inappropriate public policies
than from the absence of fairly detailed information about particular segments of construction markets.

There is reason to believe that construction statistics programs will be strengthened in the months
ahead—not simply because of our wishes, but rather because “public policy” construction statistics—
on their own merits—promise to be recognized more widely as far too important to public policy de-
termination to be allowed to remain inadequate or to deteriorate further.


Analysis of Variance Models in Sedimentary Petrology. J. C. Griffiths.


Investigation of the petrography of sedimentary rocks by analysis of variance already embraces
completely randomized, randomized blocks, unbalanced lattice and factorial designs. The hierarchy of
petrographic units represented by rock-types, formations within rock-types, outcrops within formations,




specimens within outcrops and aliquots within specimens leads to multistage sampling patterns and 
completely randomized designs. 

Interacting factors and randomized blocks design may be introduced by using two or more spatial 
dimensions analogous to “rows and columns" of agronomic experiments. In contrast, however, the aim 
of such arrangements in the analysis of sediments is the evaluation, and not solely the segregation, of 
this “fertility gradient”. As another interacting factor, order may be introduced either as a stratigraphic
variable or as a sequential arrangement of a series of experiments. Equivalent to the “variety” effect
in agronomic investigations, observers performing the same experiments lead to evaluation of operator
variation. In this connection operator inconsistency or “discrepance” has proved of vital interest in set-
ting the sensitivity level of many petrographic investigations.

Heterogeneous variance has complicated many of the comparisons but by estimating components
of variation it has been shown that mean differences are, perhaps, less important than differences in
variability in characterizing sedimentary rocks and the processes by which they are formed. Equal
sampling based on the assumption of equivalence of classes, as for example in using arkose, graywacke
and quartzite, has proved fallacious and introduces heterogeneous variance. Sampling in proportion to
variability appears to be the obvious solution as emphasized by some recent analyses of variance of
sphericity and roundness of quartz grains in sediments.


Testing the Association of Mineral Occurrence with a Set of Observable Characteristics. Ray
Mickey.


Let a topographic map M be given for an area to be explored for the presence of a set of minerals,
let V be the vector function defined over M which associates with each point of M a set of values of
geological characteristics, and let Z be the function over M which assumes the value one for those
points whose mineral density is sufficiently large and which assumes the value zero otherwise. Let z be
a random point in M whose distribution has constant density over M. We will say that V and Z are
associated if V(z) and Z(z) are not independent chance variables.
We consider the possibility of testing $H_0$, the hypothesis of no association, by the use of conditional
tests of the following class. Let $t = t(V_1, \cdots, V_n, Z_1, \cdots, Z_n)$ be a test statistic with observed value $t_0$.
Conduct a series of sampling experiments in which $V_1, \cdots, V_n$ and $\sum Z_i$ are held fixed and $H_0$ is as-
sumed. The test is conducted by the use of a binomial sampling plan. Perhaps the simplest of these of
level $1/m$ are: reject $H_0$ if

$$\sum_{i=1}^{m-1} w_i = m - 1,$$

where $w_i = 0$ if the value of $t$ obtained from the $i$th sampling experiment is less than $t_0$ and $w_i = 1$ other-
wise.


An example ів presented in which use is made of the statistic ^ 


$ t= max rin p(V;, Vj) 
ў 18. ја 
"Whore Sa is the set of indices for which Z; =a, and p is a suitable metric. 

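A conditional test of this kind can be sketched as a small resampling routine. The sketch below is only one reading of the abstract and rests on several assumptions that are not in the source: the 0/1 labels are permuted among the sampled points (which holds the observed characteristics and the number of ones fixed), $k - 1$ sampling experiments are run, and the hypothesis is rejected when none of them gives a statistic smaller than the observed one. The particular statistic (largest nearest-neighbour distance among the points labelled one), the Euclidean metric, and all function names are hypothetical choices made for illustration.

```python
import math
import random


def euclid(a, b):
    """Euclidean distance, used here as the 'suitable metric' (an assumption)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def max_min_stat(V, Z, rho=euclid):
    """Largest nearest-neighbour distance among points with Z = 1 (assumed form).

    Requires at least two points labelled one.
    """
    s1 = [i for i, z in enumerate(Z) if z == 1]
    return max(min(rho(V[i], V[j]) for j in s1 if j != i) for i in s1)


def conditional_test(V, Z, k, stat=max_min_stat, rng=None):
    """Level-1/k conditional test: reject if all k-1 resampled statistics are >= t0."""
    rng = rng or random.Random()
    t0 = stat(V, Z)
    labels = list(Z)
    u_sum = 0
    for _ in range(k - 1):
        rng.shuffle(labels)  # V and sum(Z) stay fixed; only the assignment changes
        u_sum += 0 if stat(V, labels) < t0 else 1
    return u_sum == k - 1, t0
```

Under this reading a small observed statistic (mineral points close together in characteristic space) is evidence of association; a statistic for which large values are significant would need the inequality reversed.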

Problems in Sampling the Phosphoria Formation. R. A. Gulbrandsen and V. E. McKelvey, U. S.
Geological Survey.
The Phosphoria formation of Permian age contains vast resources of phosphate and other materials
distributed over an area of 135,000 square miles in eastern Idaho and adjacent states. In an effort to
appraise the total mineral resources in the formation and to determine its origin, the U. S. Geological
Survey has sampled the full thickness of the phosphatic members of the formation at about 200 scattered
sites. The sampling plan was based mainly on geologic intuition, or geologically educated judgment;
it took account of previous information available on the variation in composition and thickness of
[...] the distribution of the rocks and their outcrops, and the purposes the samples were expected to


All samples collected are being analyzed for phosphate and other constituents of prime economic
[...] It is expected that the average of five to fifteen widely spaced samples collected from an
area of a few tens of square miles will represent the grade and thickness of deposits in that area and that
the range in grade and thickness [...] will indicate the magnitude of variation to be expected


346 AMERICAN STATISTICAL ASSOCIATION JOURNAL, JUNE 1954 


Longitudinal Studies of HIP Experience, Background and Some Findings of Pilot Study. Neva R.
Deardorff, Health Insurance Plan of Greater New York.

The longitudinal studies now in progress at the Health Insurance Plan of Greater New York are
essentially examinations of the experience of people covered for comprehensive prepaid medical care
for periods of time up to a maximum of four years. Three phases of the experience of these people are
being examined: their enrollment experience, their utilization, and the diagnoses reported for them. 
The happenings to a constant group of people can be reported from year to year and the changes noted 
in the group's pattern of experience, irrespective of the sequence of changes experienced by any one 
person or family. Another focus of interest is the movement of persons and families from one category to 
another over the years. 

Longitudinal studies point toward the treatment of each person’s history rather than his condition 
at a given time, as the unit of statistical manipulation. In a sense, it is the statistical approach to a group
of persons or families, each with a continuing life history to be considered, rather than the construction 
of cross-section views of these people's characteristics made from time to time. This Project was planned 
and is being carried out under the direction of a committee composed of persons who have no affiliations
with HIP, implemented by a specially employed staff. The work of the Project is financed by grants 
from the Commonwealth Fund and the Rockefeller Foundation. 

Among the requirements for effective longitudinal studies is the need for a comparatively large
number of subjects since almost all groups are subject to attrition from death and other causes of loss of
subjects and since patterns of change tend to proliferate with each added year of observation. (A dia- 
gram illustrating these points was presented.) The HIP studies are based on random samples comprising 
8,025 persons who were under observation from 1948 to the end of the study period (December 31, 
1951), 9,325 since 1949, and 3,500 since 1950. Besides these, there are 8,000 persons whose coverage was
terminated or had been interrupted or who were spouses and children acquired in the last year (1951) 
of the study period. These 8,000 cases have been included not only because of their place in annual 
tables, but also for purposes of some special analyses on enrollment, utilization, and diagnostic condi- 
tions. Altogether a total of 28,850 persons are included in these studies. Three tables from a pilot study 
were distributed to illustrate a few types of analysis which are being employed in the presentation of 
data.


Longitudinal Study of Health Insurance Plan of Greater New York. Nathan Goldfarb, Health
Insurance Plan of Greater New York.


Although there is an increasing recognition of the particular advantages of longitudinal data as
contrasted to cross-sectional data, there has been only limited experience thus far in the organization and
processing of large masses of data for longitudinal analysis. It would, therefore, be extremely useful for
those who are interested in longitudinal analysis to discuss the problems and methods of organizing and
analyzing longitudinal data.

There are two basic sources of data for this Special Research Project, namely, (1) the documents
completed by the physician and submitted to HIP at regular intervals, which identify the individuals, 
and give the tentative diagnosis and other related information, and (2) the registrar records which 
provide the enrollment data and the personal identifying characteristics for each individual. In order 
to determine whether the physicians’ reports were being prepared with sufficient care for the needs of
the Research Project, the data from these reports were checked for correspondence with the records in
the physicians’ files and found satisfactory. The 10 per cent sample of families and individuals covered
by the HIP, for which all the Project data are being obtained, was also tested and found representative for
a number of characteristics.

The objectives of the Project were framed in a series of questions. It was evident that a history
and summary of the medical experience for each individual for the entire period of the study was
required. The Project then set about to summarize data covering approximately 30,000 people, with about
315,000 medical services, for a period ranging up to four years. Coding and tabulating methods were
evolved which permitted an examination of the medical experience and care for each person in the
sample. Codes were developed to give the detailed diagnoses within each study year, cumulative
diagnostic codes to summarize the total diagnostic experience for an entire year, and still another set of
cumulative codes to summarize the diagnostic experience for combinations of successive years of
coverage: 2 years, 3 years, and 4 years. Chronic diseases were classified separately from the non-chronic
diseases. Every medical condition was coded, but there was no segregation of “primary” and “secondary”
diagnoses, nor was there any count of episodes. These decisions were in part dictated by the nature of
the reporting. Where a diagnostic description was evolved over a period of time, the “best” diagnosis
was chosen on the basis of a set of rules which, for example, gave priority to the specialist's report over
that of the general physician, and to the later diagnosis over an earlier one in a sequence of visits to the
doctor.
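
The two priority rules quoted for choosing the “best” diagnosis can be sketched as a simple ordering. This is only an illustration: the abstract gives these rules as examples from a larger set, and the record layout, the names `Report` and `best_diagnosis`, and the decision to apply the specialist rule before the recency rule are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    visit: int        # position in the sequence of visits (1 = earliest)
    specialist: bool  # True if the report came from a specialist
    diagnosis: str


def best_diagnosis(reports):
    """Pick one diagnosis: specialist over general physician, then later over earlier."""
    return max(reports, key=lambda r: (r.specialist, r.visit)).diagnosis


# Hypothetical sequence of reports for one condition.
history = [
    Report(1, False, "chest pain"),
    Report(2, True, "angina"),
    Report(3, False, "indigestion"),
]
chosen = best_diagnosis(history)
```

Here the specialist's report at the second visit is preferred even though a general physician reported later; with no specialist report, the latest diagnosis would be taken.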




SUMMARIES OF PAPERS 347 


The tabulations on utilization will show the trend for various cohorts of individuals indicating how,
with the passage of time, the factors of age, sex or a diagnostic type affect the quantity of medical care.


“Spot Checks” in Lieu of Complete Censuses. Howard C. Grieves, Bureau of the Census.

In the summer of 1953 the Congress indefinitely postponed the Quinquennial Censuses of Manu- 
factures, Mineral Industries, Business and Transportation. It substituted the sum of $1.5 million for 
“spot checks” in the fields of manufactures, business and agriculture. The appropriation is being used 
for three broad purposes: (1) to fill some of the gaps in national aggregates created by the deferral of the 
complete censuses; (2) to increase the timeliness of reports; and (3) to develop some new statistical 
indicators. 

Specifically, sample surveys will provide national totals of wholesale, retail, and service trades for 
the year 1953. The Annual Survey of Manufactures for 1953 will also be carried out. Monthly estimates
of retail sales and the labor force are being improved by increasing the number of primary sampling
areas. The monthly wholesale series will henceforth utilize a probability sample developed for the 1953
Annual Survey. The publication County Business Patterns is being compiled on a greatly accelerated 
time schedule covering 1953. Advance estimates of retail sales by major kinds of business employing a 
probability sample are being released on the 10th day following the month covered. A complete census 
of retail trade, using an abbreviated report form, is being taken in Dallas on a trial basis with financial 
assistance from local business. Finally, an effort is being made to improve the monthly measures of retail
inventories. A number of other smaller projects are also in progress.

While much useful data will be forthcoming it is important to note some of the purposes of complete 
censuses which cannot be achieved with the resources available. The most important omission will be 
statistics for small areas, cities, counties, etc. Similarly, detailed characteristics, fine industry break- 
downs, commodity statistics, etc., cannot be provided. The planned improvement of the samples in cur- 
rent use based on enumerations of the entire universe correctly classified must be deferred. Similarly, 
the planned coordination of Federal statistical series to achieve uniform classification is postponed until 
the complete Censuses are taken. 


[...] in Collection and Processing of Mass Statistical Data. Morris H. Hansen and William
N. Hurwitz.


Large-scale census and survey operations offer numerous opportunities for the application of
statistical and other scientific methods. Such applications are now frequently referred to as operations
research. The general aim in developing and improving methods is to produce the needed results at
minimum cost, and within specified time schedules and other restrictions. It involves making the fullest
possible use of available resources.

Modern sampling methods make it possible to provide timely results of known precision at low
cost through monthly, quarterly, annual, or special sample surveys of the population, retail trade,
manufacturing, and other subjects. Also, sampling methods are applied in the major censuses to collect
and tabulate much of the basic information at reduced cost, to produce more timely results, to check
on the quality of the census results, and in other ways.
One of the important uses of statistical methods in connection with both censuses and current work is in the
control of quality. Quality control usually consists of a system of sample inspection and process control
introduced to insure at low cost that the final product will meet certain quality standards. Quality checks
are to be distinguished from quality control. Quality checks have been introduced in situations
where it has not been feasible to control quality. They involve the use of intensive measurement
methods on small samples, and have been used especially in evaluating the quality of the field work in
major census operations. Not only do they guide in the proper use of the results, but they provide the
basis for improving the design of future censuses and surveys.
Other areas for the application of statistical methods include evaluation of alternative procedures
through properly designed experiments, such as alternative types of questionnaires, collection processes,
or training methods. Also, the effects of editing and other operations have been investigated and modified
or mechanical techniques substituted. Our work has indicated that much editing formerly regarded
as necessary can be eliminated or reduced with only a trivial effect on the accuracy of results.
Another area for application of statistical methods is in setting standards or incentives for personnel
performance. Very substantial production gains have been achieved simply by administrative incentives
[...] performance standards set on the basis of appropriate samples of [...] and sample observations.
Large-scale electronic computing equipment is being [...]
[...] future, as will also mark [...] or reading methods. The introduction of such equipment, together
with sampling, quality control, techniques for measuring and controlling response errors, and the other
methods mentioned should result in substantially increasing the timeliness of major census results




as well as reduce their costs. The application of statistical methods in the census has been the principal
work of a small group of versatile scientists interested in applications. They are trained in mathematical
statistics, survey methods, psychology, and other techniques, and are given complete authority to ask
questions and investigate and recommend, but are kept free of operating and [...] responsibilities.


The Current Statistics Program of the Census Bureau. A. Ross Eckler and Conrad Taeuber, Bureau
of the Census.
Much of the current statistics program of the Bureau of the Census exists to serve needs not met 
by the censuses, nor by reports which are a byproduct of government administration nor by reports based


used not only for carrying out the Bureau's own program, but also in performing similar tasks for other 
governmental units and for private agencies, when proper safeguards concerning the public use of the 
results can be assured. Through the increasing use of data available from administrative records and
through extension of services, where appropriate, to other governmental, as well as to non-govern- 
mental, agencies the Bureau of the Census may be successful in offsetting some of the losses in statistical 
information resulting from program reductions in recent years. 


Preliminary Tests and Pool Rules. T. A. Bancroft, Iowa State College.


Certain applications in experimental design and regression analyses involving preliminary tests 
of significance can be arranged as special cases of a test of a general linear hypothesis in canonical form 


complex tests. Bechhofer, using exact methods, was able to evaluate the integrals involved in the
expression for the power and size of such tests in closed form for only a few special cases of the parameters
involved. In order that information on power and size of such tests be explicitly available, certain ap-


An Extension of Preliminary Tests for Pooling Data. D. V. Huntsberger, Iowa State College.
Mosteller (Jour. American Stat. Assn., Vol. 43, 1948) has shown that, if $X_1$ and $X_2$ are independent
estimates for the means $\mu_1$ and $\mu_2$ of two normal populations with the same known variance, the “sometimes
pool” estimator for $\mu_1$ based on a preliminary test of the hypothesis $\mu_1 = \mu_2$ is subject to large losses
of efficiency, relative to $X_1$, for some range of $\gamma = \mu_1 - \mu_2$. In this paper it is shown that, if
$T = (X_1 - X_2)\sqrt{n}/(\sqrt{2}\,\sigma)$ is used for the preliminary test, the estimator
$W(T) = \phi(T)X_1 + [1 - \phi(T)](X_1 + X_2)/2$, where
$\phi(T)$ is a continuous function of $T$, provides a marked reduction in the maximum possible loss of


Simultaneous Test of Linear Hypotheses by Analysis of Variance Methods. M. N. Ghosh, University
of North Carolina.


[...] type of data, the tests for different hypotheses [...]
to sets of parameters are not usually independent. A notion of quasi-independence has been developed [...]


[...] significance level of such a test is defined as the probability of rejecting at least one of the hypotheses [...]


[...] are non-orthogonal. These methods are applied to analyze growth data for children, especially
in relation to stage of sexual maturity and certain blood chemicals.


It is well known that, for some problems which are appropriately evaluated by analysis of variance and
covariance methods, the procedures of univariate analysis must be generalized to the multivariate case.
Reasons why such generalizations are necessary have been made quite explicit by H. Hotelling (1947)


[...] matrix variate instead of a single variate. In particular, the multivariate normal model is assumed,
the estimation space is specified, the normal equations defined, and the estimates of the parameters
are made. Hypotheses on the multivariate parameters are given and the least-squares process is
used to determine the sums of squares and cross-products which are appropriate for testing the
hypotheses. A numerical example is carried through to illustrate all of the calculations necessary to
estimate the parameters and make tests of significance. The method of analysis is quite similar to the
univariate calculations.


[...]ng Coefficients for Age-Adjusted Death Rates. Harry Smith, Jr., University of North Carolina.


The problem of finding a descriptive mortality index for a community is reviewed. The methods for
adjusting death rates for age are discussed in terms of the criteria for the use of each. The advantages
and disadvantages of each are also listed. A new method of adjusting death rates for age is proposed
when the comparison of two communities is to be made. The assumption upon which this new method
is based is that the mortality functions of the two communities are of the same shape and separated by
a constant differential. Any overlapping that occurs is assumed due to sampling fluctuations. The solution
is based upon the method of maximum likelihood. A comparison of the results using this new
method with results of other methods is shown. The use of a discriminant function in determining those
[...] most influential in discriminating between states in the past ten years is also studied as
a possible means of calculating an over-all mortality rate.


Statistical Evidence on Economies of Scale. Frederick T. Moore, Bureau of Mines.


Statistical evidence on economies of scale in manufacturing is scarce because of the necessity of
making detailed cost studies on plants, or, on an engineering basis, computing production functions.
In either case the procedure is tedious. Engineers have derived certain rules of thumb for evaluating the
relationship between capital costs and capacity. One such rule is the “.6 factor” rule which states that
the increase in capital cost is given by the increase in capacity raised to the .6 power. For example, the
surface area of a spherical tank (which essentially measures the cost) increases as the volume of the
tank (capacity) to the two-thirds power. The “.6 factor” has been applied both to process equipment
and complete plants. It is apt to fit best industries which are (a) capital intensive; (b) continuous in
operation; and (c) with a homogeneous standardized product. The chemical and mineral processing
industries in general meet these criteria.
Tests were made on samples of plants from several industries, using the formula: $\log E = \log a + b \log C$,
where $E$ = capital cost; $C$ = annual capacity and $a$ and $b$ are constants. Values of $b < 1$ indicate
economies in capital cost and since operating costs tend to behave in the same fashion, this also
indicates economies in scale. In the tests the following values for $b$ were obtained: alumina .95; aluminum
[...] .98; aluminum rolling .88; aluminum extrusions 1.0; cement .77; tonnage oxygen .63. Tests
were also made for individual processes in these industries. A separate study of the production function
for pipelines indicated economies of scale up to approximately 200,000 barrels of throughput
and constant returns for larger throughputs.
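
The “.6 factor” rule and the log-linear formula can be illustrated with a short sketch. The function names, the base plant used for calibration, and the ordinary least-squares fit are assumptions made here for illustration; the abstract does not say how its values of b were estimated.

```python
import math


def scaled_cost(base_cost, base_capacity, new_capacity, b=0.6):
    """Capital cost at new_capacity under E = a * C**b, calibrated at a base plant."""
    return base_cost * (new_capacity / base_capacity) ** b


def fit_scale_exponent(capacities, costs):
    """Least-squares fit of log E = log a + b log C; returns (a, b)."""
    xs = [math.log(c) for c in capacities]
    ys = [math.log(e) for e in costs]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(y_bar - b * x_bar)
    return a, b
```

With b = .6, doubling capacity multiplies capital cost by 2 to the .6 power, or about 1.52, so cost rises by roughly 52 per cent while capacity rises by 100 per cent. A fitted b close to 1, as reported for aluminum extrusions, would indicate no such economy.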
It is suggested that a simplified method for evaluating economies of scale can be visualized in
[...] steps: (1) extensive engineering information is available showing the cost of process equipment
related to some engineering magnitude (e.g., square feet of heating surface); (2) the engineering magnitude
can be related to capacity by a [...] formula (e.g., the capacity of a tank can be related
[...]); (3) [...] above, cost can be related to capacity. [...]
combined to show a complete plant.




Stability of Technical Coefficients: Evidence from Inter-Plant Differences in Labor and Materials
Productivity. A. Phillips.

Technical coefficients for Leontief matrices appear to be quite stable through time. This may arise 
from two distinct conditions. First, there may be a relatively stable industry technology which causes 
each plant to have approximately the same coefficient. Random influences affect the plant coefficients 
but tend to offset each other, resulting in stable aggregate industry coefficients through time. On the 
other hand, industry coefficients may be stable even with very different plant coefficients so long as the
plant coefficients are themselves quite stable and the plants produce consistent shares of the total 
industry output. 
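
The second condition can be made explicit with a short identity. The notation is not in the original and is introduced only for illustration: $x_{pt}$ is the output of plant $p$ in year $t$, $a_{pt}$ is its input per unit of output, and $s_{pt}$ is its share of industry output.

```latex
a_t \;=\; \frac{\sum_p a_{pt}\, x_{pt}}{\sum_p x_{pt}}
    \;=\; \sum_p s_{pt}\, a_{pt},
\qquad
s_{pt} \;=\; \frac{x_{pt}}{\sum_q x_{qt}} .
```

If each $a_{pt}$ and each $s_{pt}$ is roughly constant over $t$, the industry coefficient $a_t$ is roughly constant as well, however widely the $a_{pt}$ differ from plant to plant.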

Evidence from one industry shows the latter to be the case. Little central tendency of plant coeffi- 
cients is found. But each plant tends to have the same inputs of labor and materials per unit output each 
year and produce the same relative share of industry output. Differences among plant coefficients appear 
related to other factors such as principal product type, integration and kind of material consumed, 
though most differences cannot be explained with available information. Since each plant may have its
own technology which is properly part of the industry, technology is not ruled out as the cause of 
stability, though one fairly unique and stable industry technology does not appear to be the cause. 


Adequacy of International Trade Statistics for Economic and Business Analysis. J. Edward Ely.


There are important conflicts between the various users of international trade statistics as to what 
the statistics should and can do. These conflicts have been resolved in different ways in different coun- 
tries with the result that countries follow widely varying definitions and practices in compiling their 
foreign trade statistics. It follows from this that what may appear from the comparison of the statistics 
of trading partners to be errors in the statistics, are frequently differences caused by differences in
definitions.

International trade statistics face a unique problem in the fact that, in contrast to all other types of 
statistics, more than one sovereign country compiles detailed information on the same phenomenon, in 
this case the movement of goods from one country to another. The detailed figures presented by a coun- 
try in regard to its foreign trade are, therefore, subject to searching comparison with similar detailed 
figures released by another country on the same transactions. Adequate trade statistics for such multiple- 
country use will face major conflicts of interest with domestic uses. Since many of these and other 
conflicts may turn out to be irreconcilable it may be that there will have to be a substantial increase in 
the use of dual compilation procedures designed to fulfill conflicting needs for information. 


Accuracy of Foreign Trade Statistics. Oscar MORGENSTERN. 


Accuracy of Foreign Trade Statistics can be estimated by comparing records for the same transac- 
tion by exporting and importing countries. This reveals for all periods very large differences in pairwise 
trade both for values and quantities. Difficulties in classification of commodities, transportation costs,
tariffs can explain only part of these differences. Even if successfully enumerated, the various factors
work jointly upon the final number recorded and their contributions are not known individually at each
instant. Hence substantial corrections are not feasible. An exhaustive enumeration of factors does not
make foreign trade data, when they have the properties listed by these factors, suitable for a fine-structured
theory of international trade using such concepts as terms of trade or for monthly comparisons
with exchange rates, interest rates, etc.
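
The pairwise comparison of exporter and importer records can be sketched as follows. The symmetric measure used here (difference divided by the mean of the two records), the record layout, and the function names are assumptions for illustration; the abstract does not state which measure of difference was used.

```python
def relative_discrepancy(exporter_value, importer_value):
    """Importer's record minus exporter's record, as a fraction of their mean."""
    mean = (exporter_value + importer_value) / 2
    if mean == 0:
        return 0.0
    return (importer_value - exporter_value) / mean


def pairwise_discrepancies(records):
    """records: iterable of (exporter, importer, exporter_value, importer_value)."""
    return {
        (exp, imp): relative_discrepancy(ev, iv)
        for exp, imp, ev, iv in records
    }


# Hypothetical record: country "A" reports exports of 90 to "B"; "B" reports imports of 110.
example = pairwise_discrepancies([("A", "B", 90.0, 110.0)])
```

A single figure of this kind shows only the size of the disagreement; as the abstract stresses, it cannot be split among classification, transport cost, and tariff effects.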

The most striking example is offered by gold, a clearly defined, easily recognizable commodity of
high value relative to quantity, costs of shipment and usually accorded exceptional care in handling.
Yet statistics of gold movement, pairwise compared, reveal disastrous differences whether 1900, 1907 or 
1928, 1935 are taken as samples. This holds true for monthly as well as cumulative yearly data. 

There appears to be little hope ever to reconstruct series for the past on gold useful for most purposes
of economic theory. Present information can only be correlated with the discussion of broad tendencies
in economic life. The same is true, a fortiori, for the far more involved statistics of other commodities
and the large categories into which they are put. These sober facts should lead to the establishment
of a program for future data on foreign trade with known errors, perhaps on a sampling base.


On Respondent-Nonrespondent Differences Observed in the Pittsburgh Morbidity Surveys. Daniel
G. Horvitz, North Carolina State College.


In morbidity surveys in particular, and other surveys as well, it is not uncommon to find a field 
procedure in use which permits the interviewer to obtain information required on all members of each




In the absence of an opportunity to measure the response bias directly, data collected on illnesses 
experienced in the month prior to interview in two surveys (carried out a year apart with a probability 
sample of approximately 3000 households located in the Arsenal Health District of Pittsburgh) were 
subjected to internal study in an attempt to characterize the rather large differences in the rates ob- 
served for respondents and nonrespondents. The data were examined first to determine the relationship, 
if any, of these differences to age, sex and size of household. Almost in general the data yielded no indi- 
cation of a relationship between the observed respondent-nonrespondent differences and these variables 
whether for all illness or specific categories of illness. There was slight evidence of a greater
respondent-nonrespondent difference (i) for males than females for noninfectious conditions and (ii) for those in the
15-34 year age group versus older age groups for all types of illness. This latter observation was accounted
for by a strong relationship between respondent-nonrespondent differences and age for female
conditions associated with pregnancy and childbirth. For these conditions the respondent-nonrespondent
differences can be considered actual rather than due to any reporting bias. The respondent-nonrespondent
prevalence rate differences observed for the specific chronic conditions of arthritis and
heart disease also were found to be unrelated to age, sex and size of household.

Examination of the respondent-nonrespondent differences for illnesses still in progress on the date 


ference may be an actual fact. This observation together with the consistency of the differences with 
age, sex and size of household requires further verification from morbidity studies conducted else- 
where. They also require (pending substantiation) investigation of the feasibility of a scheme for 
morbidity surveys whereby in a relatively small random segment of the sample households the respondent
is selected at random from those eligible. The respondent-nonrespondent differences in these households
should provide an estimate of the reporting bias portion of the differences observed in the remaining
households. This suggestion presumes that the actual and bias portions of the differences do not
vary in a compensating fashion to yield total differences unrelated to age, sex and size of household.


Questionnaire Design and Related Methodological Problems in the Canadian Sickness Survey.
Robert Kohn, Dominion Bureau of Statistics, Ottawa.


Problems of design are somewhat different for questionnaires and schedules, although many re- 
quirements apply to both. The basic document in the Canadian Sickness Survey was an interview 
schedule. Principles of design are discussed in three main categories: 1. contents, 2. wording, and 3. 
layout and format. The solution for particular problems will always depend on the type of informant and 
enumerator to whom the questions are directed. 

The contents are dictated by the objective of the Survey and the tabulation program. They should 
be kept to the minimum required. If after the beginning of the survey the need arises to add supplementary
questions, these should not be allowed to jeopardize the original ones and enumerators should
be fully informed regarding their purpose and meaning. The wording must be such as to elicit the same
answer in the same situation from a variety of people. It must be adapted to those of the prospective
informants and enumerators who know and understand least about the survey. It must be easily
understandable and yet concise. Instructions must also be drafted with great care and some machinery must
be provided for prompt and uniform explanation and supplementation of the instructions if required
during the survey. Layout and format must be such as to facilitate interviewing, recording, checking,
editing, and coding. Pretesting of schedules will reveal shortcomings before the actual survey gets under
way. Qualification and training of enumerators are important and so are methods to solicit and maintain
the co-operation of the informants. 


The Future of Railroad Shares. Pierre R. Bretey, Hayden, Stone & Co.


During the past decade the rails have suffered from three major factors: (1) Delays by regulatory
authorities in adjusting freight and passenger rates to higher costs, particularly following sizable wage
increases. There have been many such delays since World War II. However, the “adjustment lag period”
has consistently been reduced over recent years, and may be almost entirely eliminated through passage
of legislation. (2) The inflexibility of a large percentage of railroad costs, with the result that historically
earnings have declined sharply when gross revenues contracted. Notwithstanding [...]
[...] of railroads had achieved a flexible cost position before the [...]
demonstrated better expense controls than exhibited by many industries. (3) A lowered institutional
appraisal resulting from the collapse in railroad earnings and in railroad equities both in 1929-1932, and
again in 1938, collapses not duplicated since those two periods. Today most investment railroad equities
are selling at from only four to five times estimated 1953 earnings, and railroad men are [...]
that individual properties are in the best physical condition in their history and that operating
efficiency of most railroads has been restored following the rapid increase in costs immediately following



The Changing Stock Market. Sidney Lurie, Paine, Webber, Jackson & Curtis.
There are a number of inter-related factors which have an important bearing on the outlook for 


Trends in Monetary Structures in 1954. M. Dutton Morehouse, Brown Bros., Harriman & Co.

In the nonbank long-term market, the supply of funds could well exceed the private demand by
some $4 billion. In the short-term bank market the supply of funds will be governed by Federal Reserve
policy, but there should be a negative net private demand of some $2½ billion. The Treasury should be
able, therefore, to achieve measurable progress in extending the maturity of the national debt without 


The decline in private demand for bank credit must be replaced by Government credit to avoid the 
deflationary pressure of a decrease in money supply. It is evident, therefore, that the bank market not
only can but should absorb an amount of Government securities equivalent to the estimated deficit for 


second place, friction, in the form of delayed reactions to the lure of high profits, is the condition 
needed to induce innovations; and there is enough friction in most competitive industries to call forth 




SUMMARIES OF PAPERS 353 


a constant stream of new products and technological improvements. Some of the outstanding new 
products of recent decades, like the automobile and radio, were introduced into competitive surround- 
ings. Monopoly makes possible, though it does not ensure, extraordinary friction; and therefore it 
increases the likelihood that risky and costly innovations will be made. Large size combined with 
monopoly, a combination that is not inevitable, encourages the most risky and costly innovations. We 
now know very little quantitatively about the respective roles of competition, monopoly, and big business
as promoters of economic progress. If discussion is to lead to intelligent changes in public policy,
we must improve our understanding of these roles together with our understanding of how organiza- 
tional forms affect other important values of our democratic and libertarian tradition. 


Validation of Morbidity Survey Data by Comparison with Medical Records. Nedra B. Belloc, California
State Department of Public Health.

A household sample survey of morbidity in the general population was conducted in San Jose by 
the California Department of Public Health in the spring of 1952. The survey was designed to test 
several methods of measuring morbidity and to study the completeness and accuracy of the information 
obtained. For validation purposes, medical records were collected from a number of sources. With respect
to hospitalized illness, it was possible to study the net error in reporting. Admission rates based
on household survey reports did not differ significantly from those based on hospital records for the
same population. Similarly, days of hospitalization per person per year and average length of stay per
period of hospitalization were measured accurately from household survey data. Distributions of admissions
from household survey reports of hospitalization by month of admission, length of stay, and
diagnosis were similar to the distributions obtained from hospital records. Whether or not surgery was
performed was reported accurately in the household survey, but the description of the surgical procedure
was not as precise as that obtained from hospital records. This study shows that reports of hospitalization
obtained in a household sample survey are sufficiently accurate to be used for many purposes in
lieu of hospital record data.


Hospital Morbidity Reporting—Experiences and Findings of a Pilot Project. I. Methodological Experiences.
Carl L. Erhardt, New York City Department of Health.

The Hospital Morbidity Reporting Project had two major aims: to test a method for collecting
and tabulating data regarding patients discharged from hospitals in the city, and to analyze a meaningful
body of the data collected. Both general and special hospitals, operated under municipal and other
auspices, participated. Data covering the complete hospital system operated by the city are available
for analysis. Reports submitted for each patient discharged during the project period included the
usual demographic descriptions plus medical diagnoses, data on operative interventions and length of
hospital stay. The facts, including diagnostic and operative data, were reported in code or in self-coding
form, eliminating the usual expensive central processing step. The reported diagnostic code (Standard
Nomenclature of Diseases and Operations) was mechanically translated into a statistical grouping of
diagnoses (International Statistical Classification of Diseases, Injuries and Causes of Death) for analysis.
The punching of digits of the Standard Nomenclature code was limited to those necessary for ISC classification.
A condensation of the ISC was also designed for summary tabulation of hospital diagnoses
significant in frequency in the New York City area. The methodological experiences and analytic procedures
of this pilot project provide necessary background on which to base plans for routine collection
of hospital morbidity data in New York City.


Hospital Morbidity Reporting—Experiences and Findings of a Pilot Project. II. Significance of Findings.
Marta Fraenkel, New York City Department of Hospitals.

Medical diagnoses together with demographic data on hospital patients can contribute to morbidity
statistics needed by public health administration. In conjunction with information on duration and
outcome of hospital care, these morbidity data have significance for hospital administration and medical
care planning. To date, most hospital statistics cover all patients disregarding their diagnoses. The
analyses of the Project data showed that the composition of the patient load by age and diagnosis of 14
municipal general hospitals differed substantially. These differences explain many deviations in the data
on “average length of stay,” “net death rate,” etc. if computed in the traditional unspecific way.
It is shown how disease-specific data would permit more meaningful evaluation of hospital services and
provide more useful bases for estimating personnel needs. Project data supplied disease-specific information
on chronic illness. Many chronic disease patients stay only shortly in general hospitals. Short-stay
hospitalizations (for special diagnostic, therapeutic or rehabilitative measures or for care during acute
exacerbations) are elements of a long-term medical regime of many of these patients. Characterization
of integrated medical programs for chronic illness requires current disease-specific information
on hospital care to these patients.




Statistical Determination of Tolerances in Rocket Development. Edwin L. Crow, U. S. Naval Ordnance
Test Station.

The paper considers the problem of determining the tolerances necessary for the components of a 
rocket in order that it have a prescribed dispersion. The establishment of the relation between departure 
from nominal construction and the resulting deviation of the rocket from the ideal trajectory is the 
main concern of this study, but two other requirements and their utilization are also discussed briefly: 
the probability distribution of the departures under manufacture within any given tolerances, and the 
relation between any particular tolerance and the cost of attaining it. To determine with satisfactory 
precision the relation between the deviation in trajectory and departure in construction of a particular 
rocket, an unusually large departure was purposely introduced into 50 rockets, which were then fired 
along with a control group. Firing was repeated with the departure doubled, and several types of departure
were considered. In general the departure affected not only the mean coordinate of the trajectory
but also the variances. A maximum likelihood estimate of the magnitude of the deviation was not
practically attainable. An unbiased estimate of the mean square deviation was obtained by the method
of moments, and its variance derived and investigated for various cases.


The Up-and-Down Method with Small Samples. J. L. Hodges, Jr., University of California (Berkeley).

The main content of this paper is a report on computations made to determine the actual performance
of some estimates for the mean dosage parameter μ, based on up-and-down series of length 10
or less. It is found that the Dixon-Mood formula for the asymptotic variance is reasonably reliable even
in samples as small as 5 to 10. Thus, restriction (b) is not necessary. As a consequence of this fact, the
design of up-and-down experiments may be altered, so that several independent series are run simultaneously,
without serious loss of accuracy. In this way, restriction (a) can be considerably reduced.
Finally, the possibility of conducting an experiment with several short series run in parallel introduces
a flexibility into the design, which permits us to take advantage of special features of some experimental
situations with a still further increase in efficiency.
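
For readers unfamiliar with the design, the following is a minimal sketch of one short up-and-down series and of a Dixon-Mood style estimate of the mean dosage. It is an illustration under stated assumptions, not the computation reported in the paper: the function names, the toy deterministic subject, and the choice of estimator (mean level of the less frequent outcome, shifted by half a step) are supplied here and are not taken from the summary.

```python
from statistics import mean


def up_and_down(responds, start, step, n_trials):
    """Run one up-and-down series.

    responds(dose) -> bool is a hypothetical stand-in for testing one subject.
    The dose is lowered one step after a response and raised one step after
    a non-response.
    """
    dose, doses, outcomes = start, [], []
    for _ in range(n_trials):
        hit = bool(responds(dose))
        doses.append(dose)
        outcomes.append(hit)
        dose = dose - step if hit else dose + step
    return doses, outcomes


def dixon_mood_mean(doses, outcomes, step):
    """Estimate the mean dosage from the less frequent outcome.

    Mean level of responses minus half a step, or mean level of
    non-responses plus half a step.
    """
    hits = [d for d, o in zip(doses, outcomes) if o]
    misses = [d for d, o in zip(doses, outcomes) if not o]
    if not hits or not misses:
        raise ValueError("series needs at least one response and one non-response")
    if len(hits) <= len(misses):
        return mean(hits) - step / 2
    return mean(misses) + step / 2


# Toy subject (an assumption for illustration): responds at or above dose 2.3.
doses, outcomes = up_and_down(lambda d: d >= 2.3, start=0.0, step=1.0, n_trials=10)
estimate = dixon_mood_mean(doses, outcomes, step=1.0)
```

Several such short series could be run in parallel and their estimates averaged, which is the kind of design change the summary has in mind.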


Use of Statistics in Engineering in France. André G. Laurent, University of Chicago.

In France, statistical applications in engineering are chiefly concerned with quality control with
stress on: a) teaching, evidenced by creation of a training center in statistical methods for engineers at
the University of Paris; b) publications, evidenced by books by Dumas, Laurent, Mothes, and a
periodical, Revue de Statistique Appliquée; c) applications to industry, evidenced by applications in mining,
mechanical, metallurgical, and textile industries. Difficulties arise from lack of financial support,
poor reputation of statistics among industrialists, and the gap between theory and practice. Further developments
may be expected along several directions: a) widening the fields of application; b) shifts
of emphasis to descriptive and praxeological aspects as well as upon inference; c) more widespread use
of available but little applied methods.


An Example of a Fractional Replication in a Bearing Abrasive Wear Test. F. R. Окт, Priore and
William J. Kommers.

The purpose of this study was to develop an accelerated test for determining the relative wear resistances
of plastic bearing materials. First it was necessary to find the effect of the test conditions on
bearing wear. Physical considerations and limitations played the usual important role in the testing
schedule. Six variables, each at two levels, were [...] as a quarter-fractional. Three bearing ma


Gasoline Mileage in Winter Day-to-Day Use. David Frazer and Ray Decker.


Research executives of a petroleum company suggested that mileages of motor gasolines in northern
winter “home to work” driving might differ importantly because of differences in [...] losses, or in
warm-up and misfiring characteristics. Such differences, if they really existed and were of important size,
might suggest how the company could gain a competitive advantage. Therefore mileages of 15 gasolines,
7 commercial and 8 experimental, were compared in a complex statistical experiment on 15 employee-owned
and driven cars during the winter driving season. Each car in the course of the experiment ran
7 of the 15 gasolines each for one week, the gasolines being assigned to cars according to a statistical




plan called a “15 by 7 Youden square." All cars were driven normally, and gasoline consumption re- 
corded for each gasoline on each car was the difference between gasoline added and gasoline drained at 
the end of the week. 
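
As an illustration of what a 15 by 7 Youden square looks like, the sketch below builds one by cyclically shifting a (15, 7, 3) difference set and then checks its balance properties. This construction is an assumption made for illustration; the summary does not say how the authors' plan was generated, and the names `DIFFERENCE_SET`, `youden_square` and `check_plan` are hypothetical.

```python
from itertools import combinations

# Assumption: a cyclic construction from the (15, 7, 3) difference set below.
# The summary does not say which 15 x 7 Youden square was actually used.
DIFFERENCE_SET = (0, 1, 2, 4, 5, 8, 10)
N_GASOLINES = 15


def youden_square(n=N_GASOLINES, base=DIFFERENCE_SET):
    """Return plan[car][week] = gasoline index, for n cars and len(base) weeks."""
    return [[(car + shift) % n for shift in base] for car in range(n)]


def check_plan(plan, n=N_GASOLINES):
    """Verify the balance properties and return the set of pair concurrences."""
    weeks = len(plan[0])
    # Every car receives distinct gasolines.
    assert all(len(set(row)) == weeks for row in plan)
    # Every gasoline is used exactly once in every week.
    for week in range(weeks):
        assert sorted(row[week] for row in plan) == list(range(n))
    # Count in how many cars each pair of gasolines occurs together.
    pair_counts = {pair: 0 for pair in combinations(range(n), 2)}
    for row in plan:
        for pair in combinations(sorted(row), 2):
            pair_counts[pair] += 1
    return set(pair_counts.values())


plan = youden_square()
concurrences = check_plan(plan)
```

Because every pair of gasolines shares the same number of cars and every gasoline appears once in each week, car-to-car and week-to-week differences can be removed from the gasoline comparisons in the analysis.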

When the experiment was complete, statistical analysis, which eliminated all inherent “car to car”
differences in gasoline mileage and much of the “week to week” variation within each car, revealed no
significant differences among the gasolines. Further, the 8 experimental gasolines were so compounded
that they constituted a 2³ factorial which, when analyzed, showed with improved power that variation
of vapor pressure, olefin content, or polyolefin content within reasonable limits had no significant effects
on winter gasoline mileage. Inspection of results showed that real differences among the gasolines large
enough to be important commercially would almost certainly have been reported as significant in this
experiment, and therefore it was concluded that gasoline composition within reasonable limits has no
commercially important effect on winter gasoline mileage.


Operations Research as a Science. John B. Lathrop, Arthur D. Little, Inc., Cambridge, Mass.

During the last few years, the term “operations research” has been applied to a growing body of
investigative activity, directed at problems of decision and management in many areas of business,
industry, and government. It has been described as the application of the scientific method in new
areas of decision, with the broadening of the range of such application of such magnitude as to require
the appellation of a new branch of science. Operations research is certainly closely related to other
fields of science and engineering, such as systems engineering, industrial engineering, market research,
etc., and is often concerned with problems of interest to statisticians. These relationships can be illustrated
by pointing out some of the general characteristics typical of operations research, and the role
played by statistics in some examples.

As in science, the primary objective of operations research is to understand, not to act. When an 
operation is understood, the action required to improve the process is often fairly evident. The scientific 
method, by a combination of quantitative hypothesis, observation, and controlled experiment provides 
this understanding by means of relatively simple models to describe complex situations. Rather than 
reasoning from the facts to the mechanism, as in some forms of statistical analysis, the operations re- 
search scientist sets up assumed models in order to deduce phenomena to check against observed facts. 
An excellent and well known example of this method was Newton's explanation of the apparently unrelated
phenomena of planetary motion and objects falling on the earth, by the simple unifying concept
of gravity. In a study to reveal the causes of corrosion in paper mill equipment, the number of possible
factors (chemical, physical and metallurgical) was very large, as was the amount of observed data
from what amounted to an uncontrolled experiment. Although variance analysis was useful in determining
the relative importance of various classifications of corrosion, statistical analysis alone would
reveal no causes. It was necessary to develop a model of corrosion from first principles, using the physical
sciences, in order to obtain a useful understanding of the situation.

Other models have revealed the relationships between sales promotion effort and sales, between
production schedules and production costs, between warehouse stocking procedures and service to
customers, between salesmen's compensation plans and profits, and so on. In some, statistical methods
played an important part, in others no part at all. Information theory, queuing theory, servomechanism
theory, symbolic logic or simple punched card reproductions of an operation have all provided the basis
for operations research models. To summarize, operations research is a branch of scientific research
concerned more with reaching an understanding of an operation than with the relations among the
numbers describing the operation.


Effect of Current Operating Experience on the Realization of Investment Plans. Jean Bronfenbrenner
Crockett.

Comparison of the planned with the actual fixed capital outlays of manufacturing firms, as reported
to the Department of Commerce and the Securities and Exchange Commission, has indicated
that individual firms’ deviations from annual investment programs are subject to large [...]. The
hypothesis is tested that some significant part of the variance can be explained by concurrent fluctuations
in the sales, profits, and liquidity of the investing firms. These variables may influence investment
decisions either by affecting the expected return from proposed investment (in the first two cases) or
by affecting the ability to finance desired expenditures (in the third case). The data studied cover about
[...] firms in the mild recession year of 1949. Information on actual and anticipated investment and
sales was supplemented by income and balance sheet data relating to profits and liquidity.

The extent and nature of the effect on investment decisions of the explanatory variables [...]
to depend on whether the proposed investment is primarily for expansion or primarily for
modernization. Direct information was not available, but a rough attempt was made to separate firms
which might be, from those which probably were not, interested in expansion investment. For the


first group, deviations from investment programs were found to be correlated positively and very significantly
with deviations of actual from anticipated sales, while for the second group the effect of sales
deviations was not significant.

Changes in profit rates, as compared with the previous year, were also found to have a significant
effect on the investment deviations of “expansionary” firms, but this was less marked than the influence
of sales deviations and impossible to separate from it because of high correlation between the two
explanatory variables. For the “non-expansionary” group, the effect of profit movements was not significant,
and (except for cases of very large profit declines) the relationship appeared to be negative,
possibly indicating a tendency for modernization expenditures to vary inversely with profit movements
arising from changes in labor and materials costs. The effect of changes in liquidity on investment deviations
is difficult to ascertain. However, in those cases where the observed liquidity movement is clearly
something more than a result of the deviation from investment program, it appears to be a significant
factor in motivating the deviation.


Expectations, Plans and Capital Expenditures: A Synthesis of Ex Post and Ex Ante Data. Robert
Eisner, Northwestern University.

Ex ante and ex post data relating to sales and capacity changes, profits, capital expenditures and
other variables have been derived from replies in McGraw-Hill questionnaire surveys and from independently
obtained corporate balance sheet and income statement information. Analysis is focused on
the determinants of investment and, in particular, on evidence of operation of the acceleration principle.
An “acceleration component” of investment is isolated and observed. Its absolute magnitude is
found to be positively related to both current and lagged actual changes in sales and to expected changes
in sales. And a larger “acceleration component” is associated with a larger total of investment. This
evidence seems to fit well assumptions of recent economic theory of Hicks and Harrod which suggests
a great significance for even a multilagged and quantitatively limited role of the acceleration principle.
Investment is also found to be positively related to the rate of profit.

Expectations of sales increases are related to very great or long-term growth, but current sales 
changes are generally related negatively or not at all to expectations of immediately future changes. 
The directions of expected changes, but not their magnitude, can be matched to corresponding actual 
changes in sales. Discrepancies between planned and realized investment are found to be a decreasing 
function of size of firm, with very large firms appearing slower to adjust in a period of rising expenditures.
Sales changes different from those expected, particularly in the case of firms expecting increases,
are found to be a factor in the difference between investment plans and realizations. 


Teaching Statistics to Executives. W. Allen Wallis.


My discussion is based on experience in teaching statistics in the Executive Program at the School
of Business at the University of Chicago. The Program’s students are experienced executives. The objectives
of the course are to teach: (1) avoidance of common logical mistakes in reasoning from data;
(2) techniques for organizing numerical data into comprehensible form; (3) appreciation of the problems
of allowing for chance and for unanticipated effects in designing investigations and in drawing conclusions
from them; (4) a few rough-and-ready methods of analysis; (5) an appreciation of the potential
uses of sampling; (6) recognition of a statistical problem when encountered; and (7) respect for the intellectual
discipline underlying modern statistics.

Although the course operates mainly through lectures, the atmosphere is informal; class members
raise questions and make many contributions, and time is left for post-lecture discussion. Numerous
realistic examples are used, not only from business but from other fields. While we try to bring out the
general principle illustrated by an example, a systematic exposition of statistical principles would have
fairly low priority. What the students want, and what we are trying to give them, is something which
will enable them to understand and generalize the experiences they have had so they may better cope
with the broader experiences they expect to have. An important teaching device is the term paper. The
students submit written descriptions of a statistical problem (preferably connected with their own
business) and of the methods they propose to use for attacking it. This is returned to them with comments
and suggestions, and must be approved by about the middle of the term. The final papers, submitted
at the end of the term, are read carefully, and many students receive letters from the instructor
commenting on them in detail. A statistics graduate student attends the lectures and writes detailed
notes for distribution not later than the next class meeting. Regular textbooks are also used, and some
exercises and quizzes are given.

We have come to feel that the kind of course developed for the Executive Program is more suitable
on the campus than the kind of introductory course traditionally given. Consequently the courses in
the Business School and in various other departments are being unified along the lines developed in
the Executive Program.




Intercensal Needs for Small Area Data—By a Local Planning Agency. Harun G. Loomer, Philadelphia 

City Planning Commission. 

The statistical information that is needed by a local planning agency during the intercensal periods
may be divided into two classes: that which is available or can be developed from local sources, and that
which requires the special resources and authorities of a federal agency for collection and development.
The former should be fully exploited before federal agencies are called upon for assistance. In the
present situation of curtailed budgets, federal agencies should not undertake a multiplicity of new field
survey projects, even on a sample basis, but should concentrate on fully exploiting and coordinating
material that already is being collected by several such agencies. If all of the usual complete enumerations
of the Census and all of the continuous records that are now being maintained by such agencies as
B.L.S., Federal Reserve Bank, Old Age Assistance, etc. were properly brought together and integrated,
there would be sufficient indices to indicate the trends from one decade to the next. Federal-local cooperative
research should be further developed, possibly through the expansion of the facilities of
regional or local branch offices of the federal agencies. Trained and experienced personnel from these
offices could render valuable service in planning and directing local research. As a permanent continuing
agency, these offices could bridge the gaps between intermittent or one-time surveys by local agencies
and coordinate all into a comprehensive whole for the area. The authority and reputation of a federal
agency (such as the Census) is needed to assure the success of some types of studies. Costs should be
shared by participating agencies. Uniform sample surveys are of little value to local planners. Either
the sample is too small to indicate purely local trends or the procedure is too standardized to meet specific
local needs. A highly flexible cooperative arrangement would best serve the needs of local planning
agencies.


Business Uses of Intercensal Data. W. R. SIMMONS. 

Excepting local establishments, it is safe to venture that there are few business firms which do not 
make many vital planning and operational decisions based directly or indirectly on the findings of the 
Census. Generally taken for granted, like the mail and telephone service, Census reports become con- 
spicuous only when and if they are absent. Not the least of the considerable American economic progress 
has been due to more intelligent planning based on sound factual information, to which Census reports 
are indispensable. In establishing plant and store locations, in developing local, regional, and national 
sales potentials, in defining sales territories, in analyzing problems of distribution, in evaluating sales 
performance and in deciding where, when and how to improve future performance, this type of information
becomes crucial. Any substantial reduction in this service, in the commendable interest of economy,
must run the risk of impairing the efficiency of our entire economy.


Meeting the Needs for Small Area Intercensal Data. Conrad Taeuber.

The major method of meeting needs for intercensal data for small areas from Federal sources is
through Special Censuses conducted at the request and expense of local areas. Nearly 200 such Censuses
have already been conducted since 1950, primarily for the purpose of establishing base figures for the
[...] of tax funds. Although considerable interest in a quinquennial census of population about
[...] has been expressed, the prospects for it do not seem bright because of the costs. Estimates of population
change since the last census and projections into the future are prepared locally by many cities
and counties. The Census Bureau has issued a report suggesting methods of estimating current populations
of small areas. School censuses are well established, but generally are not conducted in a manner
to provide general purpose statistics. They represent a major resource for local area data which needs
further development. There is need also for further research into the utility of the wide variety of local
sources which could be used as indicators of population change in individual localities.


Industrial Production Index. Clayton Gehman, Lorman Trueblood, Arthur L. Broida, M. H.
Schwartz, Milton Moss, Peter M. Cody, Federal Reserve Board.

The Board of Governors of the Federal Reserve System on December 18 released the basic postwar
revision of the index of industrial production. The more obvious major features of the revision are an
updating of the comparison base period from 1935-39 to 1947-49 and a change to the Standard Industrial
Classification from the prewar Census classification system. More substantive features relate to
updating of the weight year from 1937 to 1947, enlargement of the number of monthly series from
100 to 175, substantial improvement of many old series, the development of independent annual indexes
to which the monthly indexes are adjusted, and new seasonal factors for all major groups.

The revision so far applies mainly to the period from 1947 to date. For the earlier period the Board
[...] benchmark [...] changes developed by a joint Census-Federal Reserve project, and has linked the old to the
[...] in January 1939 for the period back to 1919. Similar benchmarks and links were made for minerals




These interim revisions have been done at the level of the major divisions of the index in order
to facilitate comparisons with the more recent period. In a general way, changes in industrial activity since
1947 are shown to be similar by both the new and the old total indexes. In the first half of the year
both indexes indicate that activity was at a record level for the postwar period, about [...] eighth [...]
a year ago. Both show that since midyear, output has been reduced fairly generally. For October the
new index was about 4 per cent below the highs which it established in May and July; the old index
was down about 5 per cent from its peak reached in March. Both indexes show that industrial production
in October was at about the same level as a year earlier.


Demonstration Teaching of Statistics. R. E. WAGENHALS. 


You can interest people in the use of statistical quality control without the use of mathematics [...]
dramatic factual demonstrations. With the use of a series of demonstration devices, people [...]
shown how ineffective 100% inspection and 10% sampling may be as compared to scientific or statistical
sampling. The repeated winner of a modified game of dice demonstrates he is not lucky but is [...]
control chart principles. A small pin ball machine demonstrates how a control chart and the normal distribution
are related. The advantage of having quality controlled products is demonstrated by five different
colored groups of twenty tubes to show how assembly tolerances can be reduced. These devices
are demonstrated and the source of information regarding their construction will be available.


Use of Experiments in Engineering Statistics. Irving W. Burr, Purdue University.


The paper presents convenient populations for measurements and binomial and Poisson distributions.
Then experiments to illustrate the following are discussed: sample frequency distribution, control
charts, significance tests, estimation, analysis of variance, statistics of combinations, acceptance
sampling, sequential analysis, linear correlation, and curve-fitting. A simple method of simulating a
normal bivariate population is presented.


Using the Experimental Approach in the Teaching of Statistics. Edwin G. Olds, Carnegie Institute of
Technology.

After a brief discussion of types of experiments which might be used, the paper discusses the [...]
of the experimental approach in connection with lectures by the instructor and laboratory work by the
student. The importance of using adequate time for orientation is emphasized. The paper closes with
a proposal that the opinions stated be put to test by means of a designed experiment.


Problems in Experimenting with the Application of Statistical Techniques in Auditing. John Neter,
Syracuse University.

Sampling is used extensively in auditing. Before any attempts to experiment with the use of statistical
techniques in auditing can be made, one must formulate the purpose of sampling in auditing in
such terms that statistical models can be constructed either for the problem as a whole or for [...]
parts of it. Once this has been done, the relative merits of various statistical models can then be investigated.
One basic approach to the purpose of sampling in auditing would be to ascertain whether the figures
in the financial statements are reasonably close to the figures which would be obtained if [...]
examination of all transactions and records, including any necessary corrections, were made. Another
approach would place emphasis upon an examination of the basic company accounting policies and


A Single-Sample Multiple Decision Procedure for Ranking Means of Normal Populations with Known
Variances. Robert Bechhofer, Cornell University.
A Two-Sample Multiple Decision Procedure for Ranking Means of Normal Populations with a Common
Unknown Variance. Charles W. Dunnett, Lederle Laboratories.
A Sequential Multiple Decision Procedure for Ranking Means of Normal Populations with Known
Variances. Milton Sobel, Cornell University.
A multipl 


" 5 > 


value, say $\delta^*$, of $\delta_{k,k-1}$ that he is interested in detecting and (2) the smallest acceptable value, say $P$,
of the probability of achieving this goal when $\delta_{k,k-1} \ge \delta^*$. Three multiple decision procedures to accomplish
this goal are given: (1) A single-sample procedure when the common variance is known.
(2) A two-sample procedure when the common variance is unknown. (3) A sequential procedure when
the common variance is known. The observations $X_{ij}$ are normally and independently distributed
chance variables $N(X_{ij} \mid \mu_i, \sigma^2)$ ($i = 1, 2, \ldots, k$; $j = 1, 2, \ldots$ ad inf.). We assume that the $\mu_i$ are unknown.
The ranked $\mu_i$ are denoted by $\mu_{[1]} \le \mu_{[2]} \le \cdots \le \mu_{[k]}$. It is not known which population is associated
with $\mu_{[i]}$ ($i = 1, 2, \ldots, k$). We define $\delta_{ij} = \mu_{[i]} - \mu_{[j]}$ ($i, j = 1, 2, \ldots, k$).

Single-Sample Procedure. (1) Enter the appropriate table with $k$ and the specified $P$, and obtain a
constant $h = h(k, P)$. (2) Set $h$ equal to $\sqrt{N}\,\lambda$ where $\lambda = \delta^*/\sigma$ and $\sigma$ is the known population standard
deviation. Solve this equation for $N$. (3) Take $N$ observations from each population where $N$ is the
smallest integer greater than or equal to the solution obtained in (2). (4) Calculate the $k$ sample sums.
(5) Select the population which yielded the largest sample sum as the population having the largest
population mean.
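
The single-sample steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the constant h must still be read from the paper's table (it is passed in here as a plain number), the equation in step (2) is taken to be h = sqrt(N) * delta_star / sigma, and the function names are hypothetical.

```python
import math


def single_sample_size(h: float, delta_star: float, sigma: float) -> int:
    """Smallest integer N with sqrt(N) * (delta_star / sigma) >= h.

    h is assumed to come from the table entered with k and P; it is not
    computed here.
    """
    lam = delta_star / sigma
    return math.ceil((h / lam) ** 2)


def select_largest(samples):
    """Return the index of the population with the largest sample sum."""
    sums = [sum(observations) for observations in samples]
    return max(range(len(sums)), key=sums.__getitem__)
```

For example, a hypothetical table value h = 3.0 with delta_star = 1.0 and sigma = 2.0 gives lambda = 0.5, so N solves sqrt(N) = 6.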

Two-Sample Procedure. (1) Take a first sample of $N_0$ observations from each of the $k$ populations.
Any integer $N_0$ will satisfy the requirements of the problem. (2) Calculate

$$s_0^2 = \frac{1}{n} \sum_{i=1}^{k} \sum_{j=1}^{N_0} \Bigl( X_{ij} - \frac{1}{N_0} \sum_{j'=1}^{N_0} X_{ij'} \Bigr)^2$$

which is an unbiased estimate of $\sigma^2$ having $n = k(N_0 - 1)$ degrees of freedom. (3) Enter the appropriate
table with $n = k(N_0 - 1)$ and the specified $P$, and obtain a constant $h = h(n, k, P)$. (4) Take a second
sample of $N - N_0$ observations from each of the $k$ populations where $N$ is the smallest integer equal to
or greater than the larger of $N_0$ and $2 s_0^2 (h/\delta^*)^2$. (5) Calculate the $k$ over-all sample sums
$Y_i = \sum_{j=1}^{N} X_{ij}$ $(i = 1, 2, \dots, k)$. (6) Select the population which yielded
the largest over-all sample sum as the population having the largest population mean.
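
A minimal sketch of steps (2) and (4) follows. It assumes the constant h has already been read from the table (it is supplied as an argument), requires at least two first-stage observations per population so that the degrees of freedom are positive, and uses a hypothetical function name.

```python
import math


def two_sample_total_size(first_samples, h, delta_star):
    """Return (s0_squared, N) from equal-sized first-stage samples.

    first_samples: list of k lists, each holding N0 first-stage observations.
    h: table constant h(n, k, P), assumed to be supplied by the user.
    """
    k = len(first_samples)
    n0 = len(first_samples[0])
    if n0 < 2 or any(len(s) != n0 for s in first_samples):
        raise ValueError("need equal first samples of size >= 2")
    dof = k * (n0 - 1)  # n = k(N0 - 1)
    sum_sq = 0.0
    for sample in first_samples:
        mean = sum(sample) / n0
        sum_sq += sum((x - mean) ** 2 for x in sample)
    s0_sq = sum_sq / dof  # pooled within-population variance estimate
    n_total = max(n0, math.ceil(2.0 * s0_sq * (h / delta_star) ** 2))
    return s0_sq, n_total
```

The second sample then consists of N - N0 further observations per population, and the selection is made on the over-all sums exactly as in the single-sample case.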

Sequential Procedure. Let the ranked sums based on $m$ observations be denoted by

$$Y_{[1]m} \le Y_{[2]m} \le \cdots \le Y_{[k]m}$$

and let

$$D_{im} = Y_{[k]m} - Y_{[i]m} \qquad (i = 1, 2, \dots, k - 1).$$

At the $m$th stage of experimentation take an observation from each of the $k$ populations and com-
pute

$$W_m = e^{-\delta^* D_{1m}/\sigma^2} + e^{-\delta^* D_{2m}/\sigma^2} + \cdots + e^{-\delta^* D_{k-1,m}/\sigma^2}.$$

(a) If $W_m \le (1 - P)/P$, stop experimentation and choose the population which yielded the largest
sum, $Y_{[k]m}$, as the one having the largest population mean. (b) If $W_m > (1 - P)/P$, take another ob-
servation from each of the $k$ populations and compute $W_{m+1}$. Continue in this manner until the rule
calls for stopping.
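
The stopping rule can be illustrated as below. The printed formula for W_m is damaged in this copy, so the sketch assumes the form W_m = sum over i of exp(-delta_star * D_im / sigma^2), with D_im the gap between the largest sum and the i-th ranked sum; that form, the function name, and the argument names are assumptions.

```python
import math


def sequential_select(stream, delta_star, sigma, p_star):
    """Apply the stopping rule to a stream of stages.

    stream: iterable of rows, each row holding one new observation from
            each of the k populations.
    Returns (index of selected population, number of stages used).
    """
    threshold = (1.0 - p_star) / p_star
    sums = None
    for m, row in enumerate(stream, start=1):
        if sums is None:
            sums = [0.0] * len(row)
        sums = [s + x for s, x in zip(sums, row)]
        top = max(sums)
        best = sums.index(top)
        # Assumed form of W_m: one exponential term per non-leading population.
        w = sum(
            math.exp(-delta_star * (top - s) / sigma ** 2)
            for i, s in enumerate(sums)
            if i != best
        )
        if w <= threshold:
            return best, m
    raise RuntimeError("stream ended before the stopping rule was met")
```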


An Optimum Slippage Test for the Variance of k Normal Populations. Donald Truax, Stanford
University.

A solution is given to the problem of deciding if several normal populations with unknown means
[...] have equal variances, or if not, which population has the largest variance. Under some
restrictions on the class of possible decision procedures this solution is shown to be optimum in the
sense that it maximizes the probability of making the correct decision when all the variances are equal
[...] is larger than the rest. The optimum procedure is given as follows: select population
$M$ if $s_M^2 / \sum_i s_i^2 > L_\alpha$, and decide that all the variances are equal if $s_M^2 / \sum_i s_i^2 \le L_\alpha$. We denote by
$s_i^2$ an unbiased estimate of the variance from the $i$th population, and $M$ represents the index of the pop-
ulation having the largest sample variance. The constant $L_\alpha$ is determined by the restriction that if
the variances are equal then that decision should be made with probability $1 - \alpha$.
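
The decision rule can be sketched as follows. The critical constant L_alpha is not derived here; it is assumed to be supplied by the user, and the function name is hypothetical.

```python
from statistics import variance


def slippage_decision(samples, l_alpha):
    """Return (selected index or None, ratio of largest variance to the sum).

    None means "decide that all the variances are equal".
    """
    variances = [variance(sample) for sample in samples]  # unbiased estimates
    m = max(range(len(variances)), key=variances.__getitem__)
    ratio = variances[m] / sum(variances)
    return (m if ratio > l_alpha else None), ratio
```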


The Use of Statistical Techniques in the Aging of Accounts Receivable. R. M. Cyert, Carnegie In-
stitute of Technology.
The aging of accounts receivable in most enterprises is done by sampling a portion of the accounts
receivable. On the basis of the aging an allowance for uncollectibles is estimated. The study described
in this paper had two primary objectives: (1) To analyze the sample design being used for selecting ac-
counts to be aged with a view toward improving the method of selection and determining the optimum
sample size, (2) To examine the plan for selecting accounts already aged to be test-checked by the auditor


360 AMERICAN STATISTICAL ASSOCIATION JOURNAL, JUNE 1954


with a view toward installing an appropriate acceptance sampling scheme. The procedure followed was 
to examine а past sample drawn from a large metropolitan department store. The sample was first 
analyzed for randomness by the use of Kendall's coefficient of rank correlation. There was no significant 
correlation found between the alphabetical rank of the accounts and the rank of the account age. It was
decided, therefore, that it would be meaningful to compute the precision with which the past sample
estimated the allowance for bad debts. The precision as computed from the sample of 15,000 accounts
was deemed too tight by the public accountants working on the study. New precision and reliability
requirements were specified and a new sample size was computed on the assumption of an unrestricted 
random sample design. The required sample size was reduced from 15,000 to 1,700. The sample design 
specifies that the accounts will be drawn systematically and the sampling error will be computed by the 
Tukey plan. 

Previous practice in test-checking the agings done by store personnel had been to select 10% of
the accounts aged. The public accountant then decided whether or not to accept the total accounts
aged on the basis of the test-check. The process of test-checking was one to which sequential analysis
was applicable. After the accountant had defined a good and a bad lot and specified the risks to be
taken, a sequential sampling plan was determined. The plan was truncated at three times the average
sample number for lots with a per cent defective equal to the slope of the decision lines. The maximum
number that would need to be sampled, therefore, is 376 and the average sample number is 126. In the
authors’ view, this study establishes the possibility of adapting statistical methods to accounting and 
auditing problems involving a high volume of homogeneous data. It is important to note that the sub- 
ject of this study and, therefore, its conclusions do not relate to an area of audit in which results are 
applied directly to the decision of giving or withholding an opinion on financial statements. 
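
A sequential plan of the kind described can be sketched with Wald's sequential probability ratio test for attributes. This is an illustration under assumptions: the abstract does not give the lot definitions or risks, so p0, p1, alpha and beta below are hypothetical inputs, and the sketch does not attempt to reproduce the figures 376 and 126.

```python
import math


def sprt_attributes_plan(p0, p1, alpha, beta):
    """Decision lines for Wald's SPRT on a fraction defective.

    p0: fraction defective of a "good" lot, p1: of a "bad" lot (p0 < p1).
    alpha: risk of rejecting a good lot, beta: risk of accepting a bad lot.
    """
    g = math.log((p1 * (1 - p0)) / (p0 * (1 - p1)))
    slope = math.log((1 - p0) / (1 - p1)) / g
    h_accept = math.log((1 - alpha) / beta) / g
    h_reject = math.log((1 - beta) / alpha) / g
    # Average sample number when the true fraction defective equals the slope.
    asn_at_slope = h_accept * h_reject / (slope * (1 - slope))
    return {
        "slope": slope,
        "h_accept": h_accept,
        "h_reject": h_reject,
        "asn_at_slope": asn_at_slope,
        "truncate_at": math.ceil(3 * asn_at_slope),  # truncation rule from the text
    }


def decide(n, defects, plan):
    """Classify the state after n items with the given number of defects."""
    if defects >= plan["h_reject"] + plan["slope"] * n:
        return "reject"
    if defects <= -plan["h_accept"] + plan["slope"] * n:
        return "accept"
    return "continue"
```

The truncation point is computed as three times the average sample number at the slope, following the description in the abstract; how the decision is forced at truncation is not stated there and is left out.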



New Experimental Designs for Paired Observations. W. S. Connor, National Bureau of Standards. 

There are many experimental situations in which only two objects can be observed at a time under
homogeneous conditions, as for example in a recent experiment at the National Bureau of Standards
which involved the comparison of several types of spark plugs in two-cylinder gasoline engines. In
these cases experimental designs with two plots per block are needed. If all possible pairs of objects are
observed, the statistical analysis of the observations is simple. However, if the number of pairs is large,
this procedure may be expensive and may provide more precise results than are needed. Thus it often
is desirable to observe only part of the pairs. The choice of this subset of pairs must be made with
care if the statistical analysis is to remain simple. Such a subset is the “two-group arrangement,” which
was suggested by the traditional experimental procedure of comparing several new objects with one or
more standards. The two-group arrangement consists of dividing $v$ objects or treatments into two groups
of $m$ and $n$ treatments ($v = m + n$), and of pairing every treatment from one group with every treatment
from the other group. No other pairs are formed. The analysis appropriate to this arrangement is de-
scribed in detail.
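
The construction of the two-group arrangement is simple enough to show directly. The sketch below only enumerates the pairs (blocks of size two); it does not attempt the analysis, and the function name and treatment labels are hypothetical.

```python
from itertools import product


def two_group_arrangement(group_a, group_b):
    """Pair every treatment in one group with every treatment in the other.

    No within-group pairs are formed, so m and n treatments give m * n blocks.
    """
    return list(product(group_a, group_b))
```

With two standards and three new objects this yields six blocks, compared with the ten blocks needed to observe all possible pairs of five objects.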

Some Aspects of Sequential Experimentation. S. Munro. 

Out of practical necessity a sizable amount of industrial and military development is being con-
ducted sequentially, even to the extreme of a sequence of samples of one. There is, however, aside from
a handful of exploratory and superficial papers, no body of mathematical theory which serves as a [...]
for the design of such experimentation. Since this experimentation is performed to make use of all the
available information, in the colloquial sense, at every stage in the sequence, the usual statistical hy-
pothesis of independence is not valid. Also, unlike classical design problems, this experimentation is
devoted to estimation rather than tests of hypotheses. Many different types of practical problems will
be discussed and an attempt will be made to show that the transition from one experiment to the next
is a statistical problem, properly subject to mathematical study.


Information Theory and Prediction. Max A. Woodbury.

A study is made of the usefulness of information theory as an aid in making and evaluating pre-
dictions. Clearly, predictions are made on the basis of information available to the predictors. Two
principal problems are faced in making a prediction: first, the selection of the prediction, and second,
evaluating the prediction in the light of what happens. Note that a prediction involves a selection of
information available and a transformation. Naturally there is interaction between the two problems
since one would change the method of prediction appropriately if the verification methods change.
Here we use the measure of information proposed by Shannon and Wiener. A value is attached to
probability distributions similar to entropy or negative entropy and the amount of information is
measured by the change in entropy due to the additional information. In practice this involves the use
of Bayes' theorem. The principal predictions studied in the paper are meteorological forecasts. An [...]
project has been set up to evaluate the methods proposed and with the help of the U.S. Weather Bureau




and the U.S.A.F. certain studies have been made. Here the verification evaluation consists of computing
an “information ratio.” This is the amount of information available in the forecast relevant to the
phenomenon predicted divided by the amount necessary for a “perfect” forecast. Techniques for dis[...]
irrelevant information have been investigated and the use of discriminant and factor analytic
methods has been undertaken.
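
One way to make the "information ratio" concrete is sketched below. The abstract does not give the formula, so this is an interpretation, not the paper's definition: the information in the forecast is taken to be the Shannon mutual information between forecast and outcome, and the amount needed for a "perfect" forecast is taken to be the entropy of the outcome. The function name and the table layout are hypothetical.

```python
import math


def information_ratio(joint_counts):
    """Mutual information between forecast and outcome, divided by the
    entropy of the outcome.

    joint_counts[f][o]: number of occasions with forecast category f and
    observed category o. The outcome must take at least two values.
    """
    total = sum(sum(row) for row in joint_counts)
    p_forecast = [sum(row) / total for row in joint_counts]
    n_outcomes = len(joint_counts[0])
    p_outcome = [
        sum(row[o] for row in joint_counts) / total for o in range(n_outcomes)
    ]
    outcome_entropy = -sum(p * math.log2(p) for p in p_outcome if p > 0)
    mutual = 0.0
    for f, row in enumerate(joint_counts):
        for o, count in enumerate(row):
            if count > 0:
                p = count / total
                mutual += p * math.log2(p / (p_forecast[f] * p_outcome[o]))
    return mutual / outcome_entropy
```

Under this reading a forecast that always matches the outcome scores 1, and a forecast statistically independent of the outcome scores 0.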


Estimation of Length of Hospital Stay from Discharge Data. Clifford A. Bachrach, The Johns Hopkins
Hospital.
This paper considers the relationship between the two distributions of patients’ length-of-hospital-
stay that are commonly tabulated and used in hospital administration. These are:
1. The distribution of the lengths of stay in the hospital of patients who are discharged during a
time interval,
2. The distribution of lengths of stay-to-date of patients who are in the hospital at a given time.

[...] hospital at a given time may be regarded as a length-biased sample of the hospital's patients; the ex-
pected mean value of the stay-to-date of such a sample is given by $(\sigma^2/2\mu) + (\mu/2)$, where $\mu$ and $\sigma$ are
population parameters of the distribution of lengths of stay of patients discharged during a time interval.
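
The formula for the expected stay-to-date is easy to evaluate from a list of discharge stays. The sketch below treats the supplied stays as the whole population (population variance, not the sample variance); the function name is hypothetical.

```python
from statistics import fmean, pvariance


def expected_stay_to_date(discharge_stays):
    """Evaluate sigma^2 / (2 mu) + mu / 2 from lengths of stay at discharge."""
    mu = fmean(discharge_stays)
    sigma_sq = pvariance(discharge_stays, mu)
    return sigma_sq / (2 * mu) + mu / 2
```

The same value can be written as E[X^2] / (2 mu), which shows why a few very long stays raise the census figure far more than they raise the discharge mean.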

An analogy is drawn between a general population and a hospital patient population. The distribu-
tion of stays of patients discharged during a time interval is the analogue of the distribution of ages at
death of a general population. The distribution of stays-to-date of patients in the hospital at a time is
analogous to the age distribution of the general population at a census. If the stationary population
[...]
data on discharged patients.


[...]metric Estimation of Survivorship. D. J. Davis, The Rand Corporation.
This paper deals with the application of statistical methods to the development of failure distribu-
tion hypotheses for complex mechanical systems and to the testing of these hypotheses. The two failure
hypotheses considered are: Exponential, in which non-failed systems are assumed to exhibit an equal
probability of failure during equal time intervals independent of the previous operating time; Normal, in
which the probability of failure per unit time for unfailed systems increases with use in such a manner
[...]
Statistical analysis is applied to failure data for a variety of systems involving people and/or electrical,
[...], and mechanical equipment. The raw failure data are arranged in frequency distributions
[...] are compared to hypothetical distributions derived from either an exponential or a normal theory
of failure, whichever is appropriate. The fit between the observed and hypothetical distribution is
[...] by means of the chi-square test. The chi-square tests indicate no significant [...]
theory and observation in all but one of the 34 data samples examined. The 34 failure distributions
are predominantly exponential rather than normal.
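
The comparison of an observed frequency distribution with an exponential hypothesis can be sketched as follows. This is a generic chi-square goodness-of-fit computation, not the paper's own procedure: the mean life is estimated from the data, the class boundaries are chosen by the user, and the function name is hypothetical.

```python
import math


def chi_square_exponential(failure_times, bin_edges):
    """Chi-square statistic and degrees of freedom for an exponential fit.

    bin_edges: increasing lower class boundaries starting at 0; the last
    class is open-ended.
    """
    n = len(failure_times)
    mean_life = sum(failure_times) / n  # estimate of the single parameter

    def cdf(t):
        return 1.0 - math.exp(-t / mean_life)

    edges = list(bin_edges) + [math.inf]
    statistic = 0.0
    for lower, upper in zip(edges[:-1], edges[1:]):
        observed = sum(lower <= t < upper for t in failure_times)
        expected = n * (cdf(upper) - cdf(lower))
        statistic += (observed - expected) ** 2 / expected
    # classes - 1, less one more for the estimated mean life
    dof = (len(edges) - 1) - 1 - 1
    return statistic, dof
```

The statistic is then referred to a chi-square table with the returned degrees of freedom; in practice the classes should be wide enough that no expected frequency is very small.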


The Demand for Citrus Products. George M. Kuznets, University of California, and Richard J.
Foote, Agricultural Marketing Service.
The paper deals with the problems of utilizing the information collected weekly by the Market
Research Corporation of America from a national panel of consumers on purchases and prices paid for
[...] citrus items and canned juices to obtain estimates of demand elasticities. The advantages of
[...]ting data which exhibit variation both over households and over time are first discussed with refer-
ence to a simple model of purchase behavior. A comprehensive study is then outlined which involves
separate regression analyses for summer and winter seasons extending over a recent period of 38 months
[...] of 10 rural-urban geographic strata. The unit of observation is a family-month and the variables
[...] monthly purchases by panel families of each of six citrus products, prices paid for these and for
[...] other products, family income, family size, an index of availability of frozen orange juice, and
month within season. The main problem encountered in proceeding with the study is the absence of in-
formation on prices confronting the non-buying families. Probably not less than 70 per cent of the re-
quired prices for each product will have to be estimated each month. Fragmentary results from several
pilot studies in progress are described. These studies are designed in part to test the usability of panel
data and in part to provide some indications on the basis of which a reasonable choice of a procedure of
estimating missing prices could be made.
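
As an illustration of how a demand elasticity is obtained by regression, the sketch below fits a constant-elasticity (log-log) relation by least squares. The functional form is an assumption, since the abstract does not state the model used; only buying families with positive quantities can enter such a fit, which is exactly why the missing prices for non-buying families matter. The function name is hypothetical.

```python
import math


def price_elasticity(prices, quantities):
    """Least-squares slope of log(quantity) on log(price)."""
    x = [math.log(p) for p in prices]
    y = [math.log(q) for q in quantities]
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    return s_xy / s_xx
```

A full analysis along the lines described would add income, family size, and the other listed variables as further regressors.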


An Analysis of the Demand for Meat. Ayers Brinser, Harry Allison, and Charles Zwick, Harvard
University.

The purpose of this study of the demand for meat, fish and poultry meat items was twofold. In the
first instance, it was an attempt to define certain factors affecting demand in addition to price and in-
come and to determine the order of their relevance. The other factors include family composition, edu-
cation, age, time available for meal preparation by the housewife, and ethnic group. The second objec-
tive was to test the effectiveness of a consumer panel to supply the data for such analysis. This paper
is a progress report of the study outlining the design of the research and the method of data collection
with a preliminary analysis of the consumer panel technique as it was applied. A full evaluation of this
aspect of the study must wait until all of the data collected have been analyzed. The results of the study
up to this point suggest that the consumer panel technique is a workable method for collecting data at
the household level with limited research resources. It remains for the continuing analysis to show
whether these data can be used to provide effective answers to such questions as whether there are dif-
ferential price and income elasticities among the various subgroups in a population. On the evidence of
preliminary estimates it would seem that the form and substance of the data collected will support
useful quantitative analysis.


Demand Analysis from the M.S.C. Consumer Panel. G. G. Quackenbush, Michigan State College.
The M.S.C. Consumer Panel consists of about 250 families who report each week in considerable
detail on their food purchases. Each family reports the quantity, price, and expenditure for each food
item purchased. About 500 food items are included, and this includes all food. Each family reports its
income for the week. The sample area to date has been the city of Lansing, Michigan. The project was
designed to continue for 10 years. The objectives of the project are to determine price, income, and cross
elasticities, as well as purchase patterns for food. The research was undertaken to help fill the need for
cross-sectional analyses over time and for analysis of data of the short-period type. Examples of current
research include use of single equation least squares methods to derive price elasticities for beef, pork,
poultry, fish, cold cuts, individual cuts of meat (steaks, roast, hamburger, etc.), and some fats and oils.
Cross-sectional analysis, using total annual purchases for individual families, has been used to derive
income expenditure elasticities for food eaten away from home, food purchased for home preparation,
and for some individual food products. It is concluded that the consumer purchase panel method has
considerable potential as a method of securing data for demand analysis and related studies.


The Flow of Net Cash Savings through Life Insurance Companies. Harris LOEWY. 

The flow of net cash savings through life insurance companies is made up of the savings of policy-
holders (internal savings) and governments, corporate debtors, and mortgagees (external savings).
Internal savings equals companies’ net income, and is the net of income flows such as first year's and
renewal premiums, interest income, and expenditures flows such as payments to policyholders and
expenses. The growing stability of internal savings is the result of the increasing relative importance of
such contractual income flows as renewal premiums and interest income. Payments to policyholders
will also stabilize as the result of the rapid growth of group and term insurance in force in which policy-
holders’ equity is small. As equity declines in the future, policyholders’ dissaving, a feature of the Great
Depression, will also be relatively less.

Individual decisions to save are mainly incorporated in the first year's premium flow which shifts
rapidly with changes in the level of income and employment. The renewal premium and interest income
flows are relatively insensitive to income changes and are increasing in stability because of the contrac-
tual element they incorporate. Total income is stabilizing because of this contractual element and be-
cause the volatile first year's premium flow is becoming a declining proportion of total income. The ex-
penditures flow is dominated by payments to policyholders. In the 30's the flow of payments to policy-
holders was strongly influenced by lapses and surrenders. If the decline in policyholders’ equity con-
tinues as a result of the growth of term and group insurance, swings in this payments flow should be of
smaller depth and amplitude. The net result of these developments should be growing stability in the
annual flow of internal savings or net cash available to finance capital formation. The flows of external
savings, amortization payments, are measured for 1947-1950. Added to internal savings, they [...]
[...] (new money) the life insurance industry made available in those years by [...]




Current Problems in Measuring Moneyflows. Morris A. COPELAND 

Moneyflows measurements constitute a set of interlocking social accounts for the economy—sec- 
tor sources and uses of funds statements, This system of social accounts is an extension of the national 
income and product system to show inflows and outflows for nine (instead of five) separate sectors, and 
to show a variety of types of transaction that do not appear in the national income and product accounts, 
particularly financial transactions. The way in which the economy must be divided into sectors for this 
purpose calls not only for replacing the business sector of the national income and product accounts 
with at least three financial and two nonfinancial business sectors; but several national income sectors 
must also be redefined. A critical current problem involved in the sectoring of the economy that is 
examined in some detail centers around the contrast between personal saving (disposable personal in- 
come minus personal consumption expenditure) and the net financial use of funds by households (in- 
crease in household cash and portfolios minus increase in household debts). The extent to which other
regular statistical compilations (as well as the national income estimates) are currently used in making 
annual estimates of sector moneyflows is emphasized. Several problems involved in putting the various 
sources of information together into moneyflow accounts are outlined. An important but as yet unsolved
problem is that of developing current quarterly estimates of moneyflows. The nature of this problem 
is considered in relation to the monthly and quarterly figures now available. 


Estimation of the Interval Rate in Actuarial Calculations: A Critique of the Person-Years Concept.
Joseph Berkson, Mayo Clinic.

Each individual in a group for which a follow-up study is being made is observed from some de-
fined origin of time, as for instance the time of receiving medical treatment. The observations may
continue to some predesignated closing date C or be terminated by a random process. The problem is to
estimate the probability of survival to time $T_j$. Broadly, two methods can be used. (1) If the probability
of survival to $T_j$ is given by some smooth survivorship function $P_j = F(T_j, \theta_1, \theta_2, \dots)$ where $\theta_1, \theta_2, \dots$
represent parameters, then with appropriate observations available, the $\theta$'s may be estimated by using
some method of “fitting” the function, such as maximum likelihood, minimum $\chi^2$, least squares, etc.
(2) If the function $F$ is not known, the actuarial method is used. In this method the total period $T_j$ is
subdivided into intervals small enough so that within each interval the survivorship can be considered
linear. The conditional probability of surviving to the end of the interval $[x, (x+1)]$, having survived
to the beginning, is $p_x$, and if this is estimated for each interval, the probability of surviving to time $T_j$
is given by $P_j = \prod_x p_x$. The essential problem, then, is to estimate $p_x$. Two estimators are discussed.
(1) “Person-years” estimate, given by $q_x = d/\mathrm{P.Y.E.}$, where $d$ is the number of deaths observed in the
interval and the “person-years exposed,” $\mathrm{P.Y.E.} = N_x - \sum^{W} (1 - T_{ix})$, where $N_x$ is the number living at the
beginning of the interval and $T_{ix}$ is the fraction of an interval for which each of $W$ “withdrawals” among
the $N_x$ was to be observed. (2) Maximum likelihood estimate, given by the solution of

$$\frac{d}{q_x} = N_x - \sum^{L} \frac{1 - T_{ix}}{1 - T_{ix}\, q_x},$$

where the summation is taken over $L$ individuals last observed living in the interval. Various approxima-
tions of each of the estimates are discussed and variances of the estimates are evaluated. It is pointed
out that the “person-years” estimate implies that it is possible to define a “withdrawal” $W$. For the
usual follow-up study a definition of $W$ requires the imposition of a single “closing date” for all persons
in the study. This involves the disadvantages that it (1) necessitates a truncation of the data that
ignores available information, (2) makes clerical work more complicated than when the maximum
likelihood estimate is used, (3) renders impossible calculation of survival rates for any series in which
less than 100% of the cases are traced to the closing date.
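
The two estimators can be compared numerically with a short sketch. The printed maximum likelihood equation is damaged in this copy, so the form used here, d/q = N - sum of (1 - T_i) / (1 - T_i q) over those last observed living, is an assumption: it is what follows from linear survivorship within the interval, with each death contributing q, each full survivor 1 - q, and each person last seen at fraction T_i contributing 1 - T_i q. Function and argument names are hypothetical.

```python
def person_years_estimate(n_start, deaths, withdrawal_fractions):
    """q = d / P.Y.E. with P.Y.E. = N - sum(1 - T_i) over the withdrawals."""
    pye = n_start - sum(1.0 - t for t in withdrawal_fractions)
    return deaths / pye


def ml_estimate(n_start, deaths, last_seen_fractions):
    """Solve d/q = N - sum((1 - T_i) / (1 - T_i * q)) for q by bisection."""
    if any(not 0.0 <= t < 1.0 for t in last_seen_fractions):
        raise ValueError("fractions must lie in [0, 1)")
    if deaths == 0:
        return 0.0

    def residual(q):
        # Positive below the root and negative above it.
        return (
            deaths / q
            - n_start
            + sum((1 - t) / (1 - t * q) for t in last_seen_fractions)
        )

    low, high = 0.0, 1.0
    for _ in range(200):
        mid = (low + high) / 2
        if residual(mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2
```

With no incomplete observations both estimates reduce to d/N; with 20 of 100 persons last seen halfway through the interval and 10 deaths, the person-years figure is 10/90 while the likelihood solution is slightly larger.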


The Problem of Within Family Contagion. Улам R. Garrex, University of California (Berkeley). 

The presence of contagion, or direct person-to-person transmission, in a disease process is often
inferred from the fact that the secondary attack rate is high compared with the primary attack rate.
An example is given to show that no such inferences regarding contagion can be made from the compar-
ison of these rates, because of the possibility of different intensities of exposure in different families.
Instead, the observed variables are taken to be the times during the period of observation at which cases
occur in a family, assuming families to be independent. A model is constructed which admits either
(1) no contagion, (2) positive contagion (in which a case makes subsequent cases more likely), or
(3) negative contagion (in which a case confers a degree of immunity on the rest of the family). The
density of the times of occurrence referred to above is given under the hypothesis of no contagion
and under the alternative of contagion (positive or negative) of a special type termed [...]



A uniformly most powerful test of the hypothesis (1) against either (2) alone or (3) alone is found,
as well as a uniformly most powerful unbiased test against both (2) and (3) together. All of these are
insensitive to variations in the intensity of exposure. The first two moments of the test statistic were
found as an aid to getting an approximate region of rejection when large numbers of families are in-
volved.


Consistency of Estimators under a Specialized Bioassay Procedure. William F. Taylor.

The consistency properties of various types of estimates are considered in this paper, with especial
emphasis on maximum likelihood and minimum chi-square estimates. A general theorem is presented
stating necessary and sufficient conditions for consistency. It is assumed that the effects of a drug are
to be evaluated by giving it in varying doses to several subjects and observing the number of these sub-
jects who respond in a given way. Let the dosage levels be $x_1, x_2, \dots, x_s$, and the number of subjects
receiving the $i$th dose, $n_i$. The proportion of subjects responding to the $i$th dose is called $a_i$. A further
assumption is that the probability, $p_i$, of a subject responding to dose $x_i$ is some function $f(\theta \mid x_i) = f_i(\theta)$,
where $\theta$ is a parameter to be estimated. For several of the commonly chosen functions, $f$, and several
methods of estimation, it has been shown that the estimates of $\theta$ so obtained are regular best asymptot-
ically normal (RBAN) in the sense of Neyman, and thus have identical asymptotic properties.

There is more than one type of limiting situation which leads to asymptotic properties, however,
and the fact that one such situation results in desirable properties may have little to do with the prop-
erties associated with other limiting situations. The results most frequently used are based on the case in
which the total number of subjects is $N = \sum_i n_i$. $N$ is allowed to increase by increasing each $n_i$, keeping
the $n_i/N$ and the $s$ fixed. When this is the limiting situation considered, maximum likelihood, minimum
$\chi^2$, logit, and probit estimates are all RBAN. When $N \to \infty$, not by increasing each $n_i$ but by taking on
more dosage levels, i.e., by increasing $s$, the above can no longer be said. This is shown with regard to
the consistency of the estimates, assuming the $n_i$ are all $< M < \infty$. While maximum likelihood estimates
can be shown, under certain conditions, to be consistent, under these same conditions minimum $\chi^2$ ones
are not consistent unless additional restrictions are imposed.
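
To make the setting concrete, here is a minimal sketch of a maximum likelihood estimate for one particular choice of the response function. The abstract leaves f unspecified, so the one-parameter logistic form p_i = 1 / (1 + exp(-(x_i - theta))) is an assumption made only for illustration, and the function name is hypothetical. For this form the likelihood equation reduces to "expected responders equal observed responders", which can be solved by bisection.

```python
import math


def ml_location_logistic(doses, n_subjects, n_responding, low=-50.0, high=50.0):
    """ML estimate of theta when p_i = 1 / (1 + exp(-(x_i - theta))).

    Solves sum(n_i * p_i(theta)) = sum(r_i); the left side decreases in theta.
    If no subject or every subject responds, the estimate is not finite and
    the search simply returns a value at the edge of [low, high].
    """
    total_responding = sum(n_responding)

    def expected_responders(theta):
        return sum(
            n / (1.0 + math.exp(-(x - theta))) for x, n in zip(doses, n_subjects)
        )

    for _ in range(200):
        mid = (low + high) / 2
        if expected_responders(mid) > total_responding:
            low = mid  # too many expected responders, so theta must be larger
        else:
            high = mid
    return (low + high) / 2
```

The two limiting situations in the text correspond to growing every entry of n_subjects with the dose list fixed, versus lengthening the dose list while each n_i stays bounded.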


Use of Census Tracts in Study of Changing Residential Patterns in Metropolitan Areas. George Dug-
gar.

For decision on issues of policy and theory, research is needed on residential localization. Studies
should be extended to encompass entire standard metropolitan areas and, in the search for a valid
typology, applied with identical statistical techniques to several metropolitan areas and to several
periods of residential construction. The 1950 Census of Housing permits comparison between standard
metropolitan areas and between parts of areas. By comparison with the 1940 Census and by reference
to data on year built, new and older housing can be compared, and areas can be distinguished in terms
of their rates of housing growth. Gross techniques for study of metropolitan areas are examined, center-
ing on the 91 areas with a city of 100,000 or more and contrasting housing in their core cities and re-
maining area, and housing built 1940-50 and earlier. Correction of results for differences in extent of
land area and use of the new urbanized area concept are found valuable. Techniques for more intensive
comparative study of growth periods, of entire metropolitan areas, and of more central and more
peripheral portions are examined, using census tracts and, in the untracted portions, other small sta-
tistical areas. Techniques include mapping of growth rates, growth concentration, and housing character-
istics, statistical analysis of relations between these (in ways relevant to theory) and identification of
small areas where growth rate or concentration suggests case study. While lack of cross tabulations
prevents direct comparisons between the housing in the several tracts by period built and characteristics,
the general patterns of co-variation of growth rate and of housing characteristic can be examined as
reflected in tract totals. It is suggested, however, that the type of distinction which has been drawn
in the 1950 census between urbanized area and standard metropolitan area be drawn, also, between the
urbanized and non-urbanized portions of tracts.


Theory of Behavior. Edward W. Barankin.
See Econometrica, Vol. 21, No. 3, July, 1953, p. 474. 


Uses of Census Tracts by Housing People. Dorothy S. Montgomery.

Tract use gained impetus in Philadelphia as a result of interest in the two Real Property Surveys
of the 1930's, as well as the 1930 Census. The Philadelphia Housing Association, the first housing group
in the city, was among the first to use the tract for analysis and recording. As an educational, Com-
munity Chest agency, we find the tract a valuable tool in educational work as well as in our [...]ing
activities. At the outset, the census tract helped to tell the story of the relationship of housing to
health, to delinquency, to dependency. Today our Association uses census data on a tract basis to inter-
pret neighborhood housing conditions in connection with the housing tours which are an essential part
of our educational program.

‘The technical uses of the tract are numerous, and most of them common to other communities. 
But I would like to describe two uses that may be unique, The first is a use that continues year after 
year as part of our tabulation and analysis of city permits covering new dwelling construction, conver- 
sions, and demolitions. These data have been regularly collected by our Association since 1923, and have 
enabled us to estimate changes in the dwelling supply throughout the city in intercensal periods with a 
reasonable degree of accuracy. Since 1945 these data have been tracted, which has added the important 
element of precise location to the previously tabulated information on type of new housing, selling price 
and lot size. The tracting of conversion data is of particular value since it has permitted analysis of the 
significant, and usually neglected changes, in the use of structures, changes that are frequently associ- 
ated with the start of neighborhood blight. It may be mentioned here that the tracting of the permit. 
data became feasible when an up-to-date street and house number directory was published for Philadel- 
phia. Prior to the directory, tracting was almost impossible. 

Tract data enabled us to discern and map the patterns in the spatial movements of Negro households
in the city, and to gauge the amount of concentration and dispersion during the last census
decade. Most of the influx of the decade had been absorbed in the traditional areas of Negro occupancy
which had become more Negro in the process. Within these areas, however, there had been substantial
shifts with the North Central Philadelphia area becoming the major center of concentration. Analysis
of the tract data also showed that, contrary to general opinion, Negroes are not finding homes throughout
the city; and that where expansion did take place, it was in or adjacent to areas in which Negroes
already constituted a large part of the population. While over-all census figures indicated that housing
occupied by Negroes was on the average of much poorer quality than housing occupied by white families,
analysis of tract data revealed that there is no inevitable correlation between the race of the occupants
and the quality of the housing. A comparison of two neighborhoods showed that while one, an area of
high Negro concentration and severe overcrowding, has become increasingly blighted in the last two
decades, the other, also an area of high Negro concentration but low density of population, has shown
an improvement in housing quality in the course of the racial change. This study, which could not have
been undertaken without tract data, will provide the Commission on Human Relations with factual
information upon which to base its educational program to end discrimination and segregation. It will
also affect other aspects of the city's official policy concerning the enforcement of minimum housing
laws.
Uses of Small Area Census Data in New York City. Florence S. Соттввш, Welfare and Health Council
of New York City.

The decade 1940-1950 brought so many shifts in the population of New York City that many
organizations are studying the 1950 population and housing data for the 2563 census tracts, or the 352
larger health areas which are composed of tracts. One of the major uses by City departments, social
agencies, churches and similar organizations is to estimate the needs of areas within the boroughs for
housing projects, schools, health programs, case work services, recreation centers, programs to prevent
delinquency, churches, etc. and to determine the location of facilities to provide these services. The
City Planning Department utilizes small area data in connection with its master plan, zoning ordinances
and its other responsibilities and for population forecasts. Several major research projects (studies of the
aged, mental health, medical care and the teen-age narcotics problem) have used the data for selecting
samples and/or for studying population characteristics and environmental factors. Utilities, commercial
and savings banks, insurance companies, newspapers, construction firms and other companies use
population and housing data to estimate loads or sales, to plan districts, for mortgage appraisals, to
determine the need for housing or branch banks, etc. Manufacturers, advertising agencies, radio stations,
and research organizations indicate dependence on block and tract data for sample marketing and public
opinion surveys.

[illegible] publications: To supplement the tract data published by the Bureau of the Census, six
organizations purchased a set of the IBM summary tract cards on the 1950 population. These [illegible]
the City Department of Health and the Research Bureau of the [illegible] publish population data for
health areas. The cards also provided data for the Population [illegible] volume of the “New York Market
Analysis,” companion of the volume [illegible] 1947 Census of Business. Both volumes of this publication
by the New York [illegible] and New York Times cover 116 retail trade districts (aggregations of tracts)
within the City and 21 suburban counties. The Research Bureau of the Welfare and Health Council
published for census tracts and health areas: “Population of Puerto Rican Birth or Parentage, New York
City: 1950.”

366 AMERICAN STATISTICAL ASSOCIATION JOURNAL, JUNE 1954


Probability Distributions of Group Organization Theory. Leo Katz, Michigan State College.

A functioning group of N individuals is represented abstractly as a multilinear, graded, directed
graph on N points. Between each pair of points (individuals), the complete relationship is assumed to
be analyzable into a sequence of categories, in each of which the strengths of the bonds in both directions
are measurable in some scale. The hope for economy in description of the group rests on the possibility
that the bulk of the information in these infinite-dimensioned vectors is contained in relatively few
(perhaps, only one) of the components. A single component (one facet of the group organization), then, is
represented as a directed graph and, if the relationship is all-or-none, as a simple linear directed graph.
Two fundamental results provide an almost complete probability distribution theory for the classical
sociometric problems. A thesis by James H. Powell of Michigan State College will contain an analysis
of the structure of the sample space of sociometric investigation which makes possible the immediate,
though not simple, expression of most of the pertinent probability distributions in terms of the numbers
of graphs satisfying certain restrictions. A second result by Katz and Powell (submitted to Proceedings
of the American Mathematical Society) gives explicitly the number of directed graphs satisfying a specific
set of local restrictions.

The principal classical problem not solvable as above is that of the distribution of mutual or
reciprocated choices. However, an iterative procedure of application of the previous results produces the
required distribution for this case. Applications of the general method include the measurement of
concentration of choice, tendency toward reciprocation, development of leaders, etc. Finally, preliminary
study of the bilinear graphs corresponding to two-dimensional group functions indicates that any attempt
to handle these mathematically would involve invention of some new mathematics. On the other
hand, this study indicates some new uses of multiple-level sociometric techniques in the field of social
psychology.
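
The problem of mutual choices can be illustrated numerically. The following is a minimal sketch, not the exact enumeration method of the summary: it assumes the simplest null model, in which each of n individuals makes d choices uniformly at random among the others, and it substitutes Monte Carlo simulation for the exact distribution. All function names are hypothetical.

```python
import random
from itertools import combinations


def random_choice_graph(n, d, rng):
    """Directed graph in which each of n individuals chooses d others at random.

    Assumed null model: choices are uniform and independent across individuals.
    """
    return {
        i: set(rng.sample([j for j in range(n) if j != i], d))
        for i in range(n)
    }


def mutual_pairs(choices):
    """Count unordered pairs (i, j) in which each member chooses the other."""
    return sum(
        1
        for i, j in combinations(sorted(choices), 2)
        if j in choices[i] and i in choices[j]
    )


def expected_mutual_pairs(n, d):
    """Mean number of mutual pairs under the random-choice model.

    Each of the n(n-1)/2 pairs is mutual with probability (d/(n-1))**2.
    """
    return n * d * d / (2 * (n - 1))


def simulated_mutual_distribution(n, d, trials, seed=0):
    """Monte Carlo estimate of the distribution of the number of mutual pairs."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(trials):
        m = mutual_pairs(random_choice_graph(n, d, rng))
        counts[m] = counts.get(m, 0) + 1
    return {m: k / trials for m, k in sorted(counts.items())}
```

An observed count of reciprocated choices can then be compared with `simulated_mutual_distribution(n, d, trials)` to judge whether reciprocation exceeds what random choice would produce.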


Tracts in Analysis of Worker Mobility. Meredith B. Givens, New York State Department of Labor.


One of the uses of census tracts is to provide a basis for aggregation of data into special-purpose area
groupings. Census tracts were recently utilized in this way by the research staff of the Division of
Employment of the New York State Department of Labor, in a study of where people work and live in
New York City and its metropolitan environs. The study was undertaken for administrative purposes,
though its results are proving to be of considerable interest to all who are interested in worker mobility
and the pattern of commuting in the metropolitan community. It involved comparisons, by small areas
within the Metropolitan Area, of the place of work of employed persons as revealed by the Division's
own employment data, compiled for employers of twelve or more workers, with the place of residence
of workers living in these same areas as shown by the Federal Census of Population. Such comparisons
of (1) employment at their work place with (2) Census data on employed persons counted at their homes
are indicative of the “balance” or “imbalance” between the number of persons working within an area
and the number of persons living in that area. The comparison does not show where the workers within
a given area come from or where those who live within the area work. For this purpose a special question
in the Census schedule would be required.

For purposes of analyzing and mapping data, the intra-State Metropolitan Area was divided into
57 districts. These districts represented combinations of census tracts in those localities which have been
tracted, namely New York City and Westchester. For Rockland, Nassau and Suffolk, which are still
untracted areas, other approaches were used for subdividing the counties. For New York City, the basic
data were first prepared on a census tract basis and then combined into 210 “conversion districts,” a
tentative grouping of census tracts developed with the assistance of a committee named by the local
chapter of the American Statistical Association. This grouping was designed for possible use in obtaining
tabulations of Census materials on a small area basis. Contiguous “conversion districts” with similarities
in employment were then combined into 32 broader districts. For Westchester, the only other tracted
county in the area, the 150 census tracts were combined into nine districts which conform to the
developmental districts delineated by the Westchester County Department of Planning.

To obtain information on employment at place of work on a census tract basis, for combination
into the broader districts, it was first necessary to code the data on insured employment according to
census tract. To obtain employment data according to place of residence for combinations of census
tracts, the Division purchased from the Bureau of the Census IBM summary cards showing, by tract,
employment characteristics of the employed labor force for New York City and Westchester. The New
York City purchase was made jointly with several other local groups using combinations of tract data
for their own purposes. The task of coding the Division's employment records by tract proved difficult
because of the lack of an up-to-date directory of census tracts for the City.

This project required the solicitation of one-time breakdowns of employment at each place of work
from multi-establishment employers. The project demonstrates the feasibility of using census tracts as
an approach to small-area analysis of data from other than Census sources in comparison with Census
data. The tract is a useful common denominator in devising larger standard areas for statistical tabulation
and analysis in a metropolitan area.
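
The aggregation step described above can be sketched as follows. This is an illustrative sketch only: the tract codes, district names, and counts are invented, and the function name is hypothetical. It combines tract-level employment at place of work and employed residents into districts and reports the balance between the two.

```python
from collections import defaultdict


def district_balance(tract_to_district, jobs_by_tract, employed_residents_by_tract):
    """Aggregate tract counts into districts and compare jobs with employed residents.

    A positive net_inflow means more persons work in the district than live there.
    """
    jobs = defaultdict(int)
    residents = defaultdict(int)
    for tract, district in tract_to_district.items():
        jobs[district] += jobs_by_tract.get(tract, 0)
        residents[district] += employed_residents_by_tract.get(tract, 0)
    return {
        district: {
            "jobs": jobs[district],
            "employed_residents": residents[district],
            "net_inflow": jobs[district] - residents[district],
        }
        for district in sorted(set(tract_to_district.values()))
    }


# Hypothetical data: three tracts grouped into two districts.
tract_to_district = {"T1": "Downtown", "T2": "Downtown", "T3": "Uptown"}
jobs_by_tract = {"T1": 5000, "T2": 3000, "T3": 400}
employed_residents_by_tract = {"T1": 600, "T2": 900, "T3": 2500}

balance = district_balance(tract_to_district, jobs_by_tract, employed_residents_by_tract)
```

As the summary notes, such a comparison shows only the net balance for each district; it does not show where the workers in a district live.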


Use of Census Tracts for Business Analysis. Perry H. Myers.

The main problem in the use of census tracts and other census data for business analysis is simply
ignorance of even interested businessmen regarding census material. An informal survey suggests that
about half of business executives think a census tract is a pamphlet issued by the Bureau of Census.
Census tract data are essential in several major marketing developments: (1) In tracing the differentiation
of the urban market and particularly the development of a new type suburban group. In this connection,
it should be noted that higher incomes, the building boom, more marriages and more children
have led to an accelerated movement of younger, middle income couples into the suburbs and, at the
same time, concentration of lower income groups within the central parts of the larger cities. To trace
this development, which has far reaching effects for both retailer and manufacturer, requires detailed
study of tract data, not only in terms of population but also in terms of income, family characteristics,
home ownership, etc. (2) The development of suburban retailing, particularly large department store
plans for shopping centers, has required detailed analysis of local data to determine the population and
characteristics of the areas to be served by these new suburban outlets. (3) Tract data are essential in
consumer sampling, both in setting up a random sample and also in projecting the results to the total
population of a city or to other cities. In general, it may be said that increasing purchasing power and
the wider distribution of discretionary income have led to greater differentiation in the consumer market
and a greater need of tracing this differentiation in terms of where people live, and who they are, through
the use of the census tract.


The Relation of Census Tracts to the General Census Program. Robert W. Burgess.


The activities of the Census Bureau, the methods used and the subject matters covered can be
partitioned in various ways, many of them significant in directing the enterprise or proposing changes.
One of the fundamental splits which affects both the determination of what shall be done and the way
in which the details of various projects shall be developed is between the nation and the small area as
the focus of interest. Typically an economist values statistics for the information they give of the nation
as a whole and of the major economic divisions thereof. He is also interested in the changes over relatively
long periods of time. The market researcher, however, and other specialists in planning economic
and social activities, need information as to the population in relatively small local areas. They want
to have an adequate basis for determining where a store or a school or a hospital can best be located
and for judging whether one of these institutions is really meeting the present and future needs of its
neighborhood. For such purposes sampling procedures do not fill the bill, and the small area statistics
provided by a reasonably recent complete census are necessary.

The needs for small area statistics have thus an important bearing on determining the proper period
between censuses covering a particular field. These needs also influence decisions as to the form in which
census results should be stated, so as to meet the needs of users and encourage greater use of all material.
It has become clear as workers in various fields apply scientific methods increasingly to the key factors
of their problems that these needs for local statistics deserve careful attention. The Census Bureau,
therefore, can and does cooperate with local agencies and individuals in developing census tracts and
stimulating their use. The Census Bureau, however, leaves the functions of analyzing tract statistics and
campaigning for educational or social projects that such analysis might suggest to individuals like
subsequent speakers on this program or the organizations which they represent. Noteworthy gains are being
made in the establishment of census tracts so that within a few months 46 standard metropolitan areas
can be entirely divided into tracts as compared with only 11 at the time of the 1950 Census. With
continued interest on the part of local committees, it is a reasonable goal that we have census tracts for all
SMA's developed for the 1950 Population Census.


How the Automobile Industry Utilizes the Census Tract in Market Determination. Ferdinand F.
Mauser, Wayne University.

The automobile manufacturers are fairly consistent in their belief that the census tract is a common
statistical tool, of value in gathering distribution intelligence. Marketing strategy based upon this
[illegible] determines placement of dealerships. Companies consider proper placement of dealerships
to be vital in their attempts to achieve maximum sales for their products. Two of the methods employed
[illegible] using the census tract as the basic working framework are the standard [illegible]
approach and the central city approach. Superimposing of selected data upon area maps [illegible]
to bring areas of poor [illegible] into focus. The type of data used for addition to the
census tract framework varies depending upon the need and makeup of the area studied. There may be
as many as 10 or 12 separate maps drawn up for a single area analysis. Statistical specifics are presented
on a three part basis: (1) the ideal situation, that is, what the company would want if it could write the
ticket exactly as it wanted; (2) the present situation as it now is; and (3) the recommended proposal in
terms of specifics, the answer to the question of what should be done now. Under this system the ideal
arrangement is used as the starting point. The situation as it now stands is then related to the ideal.
This brings the problems into focus. Changes then become evident and improvements are made gradually
in the direction of the ideal because the ideal or goal is known.


Further Generalization of Neyman's Distributions. Geoffrey Beall.

It is known (Beall and Rescia, Biometrics 9: 354-386) that
$\phi(t) = \exp\left[ m_1\, n! \sum_{k=1}^{\infty} m_2^{k}\,(e^{it}-1)^{k}/(n+k)! \right]$
is the characteristic function of Neyman's contagious frequency distributions for $0 \le n \le \infty$. Investigation
now shows that n may properly assume negative values of any magnitude and thus produce an
even more extensive family of frequency distributions. The distributions for n = -1 and n = -2 are
intimately related to those where n is nonnegative in that, for example, the third moment steadily decreases
from n infinite till n = -2. The distributions, however, change abruptly at n = -3; then the
third moment becomes very great. As n decreases further the third moment again decreases and approaches
the value obtaining for n very great. Presumably the Neyman distributions may now be expected
to fit a great variety of data. It is known that with n small and positive they fit phenomena differing
greatly from binomial situations. Current investigations suggest that much data that might be fitted
by negative binomials can be as well fitted by the contagious distribution with $n \to \infty$. It may be hoped
that the present extension will make them even more adequate. These further generalizations have suggested
that the handling of this whole system of contagious distributions may be greatly facilitated by a
transformation from the parameters $m_1$ and $m_2$ to $b = \tfrac{1}{2}(n+2)m_1/(n+1)$ and $c = 2m_2/(n+2)$.
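
The convenience of the parameters b and c can be checked numerically. The sketch below rests on two assumptions that the printed summary does not state outright: that the logarithm of the generating function is $m_1\, n! \sum_{k \ge 1} m_2^k (z-1)^k/(n+k)!$ with $z = e^{it}$ (which reduces to Neyman's Type A at n = 0), and that the transformation reads $b = (n+2)m_1/(2(n+1))$, $c = 2m_2/(n+2)$. It is restricted to nonnegative integer n, and the function names are hypothetical. Under these assumptions the mean is bc and the variance is bc(1 + c).

```python
from fractions import Fraction
from math import factorial


def factorial_cumulant(k, n, m1, m2):
    """k-th factorial cumulant, k! * m1 * n! * m2**k / (n+k)!, under the assumed form."""
    return Fraction(factorial(k) * factorial(n), factorial(n + k)) * m1 * m2 ** k


def mean_and_variance(n, m1, m2):
    """Mean and variance obtained from the first two factorial cumulants."""
    k1 = factorial_cumulant(1, n, m1, m2)
    k2 = factorial_cumulant(2, n, m1, m2)
    return k1, k2 + k1  # variance = second factorial cumulant + mean


def transformed_parameters(n, m1, m2):
    """The pair (b, c) under the assumed reading of the transformation."""
    b = Fraction(n + 2, 2 * (n + 1)) * m1
    c = Fraction(2, n + 2) * m2
    return b, c


# Example with n = 1, m1 = 4, m2 = 3 (arbitrary illustrative values).
mean, variance = mean_and_variance(1, 4, 3)
b, c = transformed_parameters(1, 4, 3)
```

In this parametrization c plays the role of the excess of the variance-to-mean ratio over unity, whatever the value of n.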


The Time-interval Approach to the Problem of Contagion. Grace E. Bates, Mount Holyoke College.

The particular approach to the problem of contagion discussed in this paper is an outgrowth of a
suggested treatment outlined in the last section of the paper “Contributions to the Theory of Accident
Proneness, Part II”, by Jerzy Neyman and Grace E. Bates, U. of Cal. Press, 1952. Starting with a
stochastic model in which the probability of an individual's incurring an accident in a given time interval
depends not only on the length of this interval but also, possibly, on the number of previous accidents
sustained (i.e., possible contagion), the random variables considered are the time intervals
from the start of the period of observation to the occurrence of each accident for an individual incurring
exactly n accidents in the interval, given that the individual had already sustained exactly m accidents
at the start of the period of observation. In this paper, the distribution of these time intervals was considered
only in the case of contagion of a very specialized type, termed linear contagion. Specifically,
this restriction on the type of contagion provides, for example, that each additional increase in the number
of accidents previously sustained (from 3 to 4, or from 10 to 11, for example) yields the same percentage
increase or decrease in that individual's chances of avoiding accidents in the period of observation.
Under this restrictive condition the distribution of time intervals is independent of the number of
previous accidents sustained and the time-interval data needed for testing the hypothesis of no contagion
does not require this information.

For this model, uniformly most powerful tests of the hypothesis of no contagion against either of the
one-sided alternatives and a uniformly most powerful unbiassed test for the two-sided alternative case
are obtained. The statistic used for these tests is the grand mean of all the time intervals for all the
individuals. The exact power function can be given explicit form but is so cumbersome to apply that a
method of approximating the power is outlined.


Contemporary Topics in Statistical Physics. G. W. Preston, Philco Corporation. 

The history of the development of the classical theory of statistical mechanics is briefly traced. It
is recalled that the behavior of macroscopic quantities of matter can be correctly described by the use
of the Laws of Motion and the theory of probability. Not only the condition for thermodynamic equilibrium,
but also the equations for the rate of processes, can be given by probabilistic statements. The
fundamental statistical attributes of matter are the relative independence of the fundamental units
of matter and their very frequent, though proportionately brief and violent, interactions. The fact that
the fundamental units of matter are nearly independent greatly simplifies the distribution problems
whereas their frequent interactions guarantee that the system will assume a very large number of
microscopic configurations during the course of any thermodynamic measurement. The fundamental
results of the recent theory of fluctuations are shown to suggest a possible approach to a statistical
mechanical theory of non-equilibrium thermodynamics. Finally, the necessity of including in the expression
for the physical entropy of a system the entropy of information obtained about the system by
experimentation is shown to imply an equivalence between the second law of thermodynamics and certain
basic theorems in the theory of statistical estimation.


Statistical Problems in Physics. Martin J. BERGER. 

This paper reviews certain aspects of the relationship between physics and statistics. Some reasons
are brought out why physicists in the past have paid comparatively little attention to problems of sta- 
tistical inference. The point is made that this neglect is unjustifiable and deplorable. Various branches of 
statistics are then examined from the point of view of their importance to physics. Finally, some typical 
problems of statistical inference are described that arise in the interpretation of physical experiments. 
It is shown that there is a large class of problems in physics to which the existing statistical methods 
can be readily applied with great advantage, while others of a more specialized nature would require 
further development of statistical theory along novel lines. The hope is expressed that physicists will be- 
gin to make greater use of existing theory, and that statisticians will look into new problems raised in 
physics to the mutual advantage of both sciences.


The Statistician in a Research and Development Laboratory. B. B. Day, U. S. Naval Engineering
Experiment Station, Annapolis, Maryland.

Using a particular research and development laboratory as a case history, the steps followed to
put a statistician to work therein are outlined in detail. Three major points are developed: (a) the selling
job required at the different levels with illustrative material presented; (b) the organizational set-up in
the Laboratory for most effective work; and (c) the statistician on the job: qualifications, working
relations, and responsibilities. Some details of the internal operation of a Statistical Office are indicated.
The paper concludes with a discussion of what the future holds for the statistician in the Laboratory.


Survey of the Theory of Finite Sampling. Joseph F. Daly, U. S. Bureau of the Census.

Not many years ago the design of sample surveys depended heavily for its success on the reputations
of its practitioners, who strove mightily to “validate” their samples with the aid of collateral information.
Today, notwithstanding some popular impressions to the contrary, finite population sampling
is one of the most respectable and thoroughly practical applications of mathematical statistics. The
present quite satisfactory state of the theory of sampling finite populations is based on two main characteristics
of the problems which the theory is designed to solve. In the first place, it is possible in many
of these problems to devise methods of sample selection and estimation such that the resulting sample
estimates can be expected to obey the laws of probability. The amount of variability in the estimates
arising from the fact that they are based on a sample rather than on a complete count can therefore be
measured by the same techniques and with the same precision that one can measure the variability in
successive drawings from a table of random numbers. In the second place, it is frequently possible to devise
formulas which will serve as good approximations to the way in which the costs of projected sample
surveys will vary with the size of sample and with the manner of selection of sampling units. This makes
it possible to define objectively the notion of an “efficient” or an “optimum” survey design, namely one
which minimizes the variability subject to fixed total cost.

Important work has been done in the way of devising techniques of sample selection which are
more efficient than simple random sampling (e.g. selection with unequal probabilities, controlled random
selection across strata, etc.) and on developing estimation formulas which make maximum use of available
collateral information (e.g. regression estimates on one or more variables). One unsatisfactory aspect
of the present theory is that it refuses to evaluate sample designs which are not subject to the laws of
probability. Recent work on statistical decision theory based on the theory of games promises to shed
some light on the properties of probability sampling methods in relation to larger classes of sample designs,
including the selection of samples based on expert judgment. It is now known, for example, that
under certain conditions simple random sampling represents a minimax strategy. Further developments
along this line can be expected in the next few years.
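
The notion of an optimum design, minimum variability subject to fixed total cost, can be illustrated with a standard textbook case that the summary itself does not spell out: stratified random sampling with a linear cost function, where the optimum sample size in stratum h is proportional to $N_h S_h / \sqrt{c_h}$. The sketch below is an assumed illustration with invented strata; it ignores the finite population correction, and the function names are hypothetical.

```python
from math import sqrt


def optimum_allocation(sizes, std_devs, unit_costs, budget):
    """Sample sizes n_h minimizing the variance of the stratified mean for a fixed budget.

    Assumes total cost = sum(c_h * n_h); fractional sample sizes are not rounded.
    """
    weights = [N * S / sqrt(c) for N, S, c in zip(sizes, std_devs, unit_costs)]
    scale = budget / sum(N * S * sqrt(c) for N, S, c in zip(sizes, std_devs, unit_costs))
    return [scale * w for w in weights]


def variance_of_stratified_mean(sizes, std_devs, sample_sizes):
    """Variance of the stratified mean, ignoring the finite population correction."""
    total = sum(sizes)
    return sum(
        (N / total) ** 2 * S ** 2 / n
        for N, S, n in zip(sizes, std_devs, sample_sizes)
    )


# Invented example: two strata, budget of 600 cost units.
sizes, std_devs, unit_costs, budget = [1000, 500], [10.0, 40.0], [1.0, 4.0], 600.0

optimum = optimum_allocation(sizes, std_devs, unit_costs, budget)
proportional = [200.0, 100.0]  # proportional allocation costing the same 600 units

v_optimum = variance_of_stratified_mean(sizes, std_devs, optimum)
v_proportional = variance_of_stratified_mean(sizes, std_devs, proportional)
```

With these figures the optimum design takes 120 units from each stratum and gives a smaller variance than proportional allocation at the same cost.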


Finite Sampling Concepts in Experimental Statistics. Jerome Cornfield, National Institutes of
Health.

It is necessary to know the expectation of each of the mean squares in an analysis of variance in
order (a) to choose a proper error term (b) to estimate components of variance. These expectations depend
on what was sampled. Usually this dependence is expressed by denoting each observation as a
linear compound of fixed and random variables, making certain assumptions about these components,
and deriving the expectations. Different assumptions lead to different expectations, however, and in
situations more complex than the one-way classification it is not always possible to choose unequivocally
among the different possible assumptions.

A logical way out of this difficulty is provided by finite sampling concepts. This is illustrated in
the two-way classification, for which one can assume a population of elements classified into R rows
and C columns with N elements in each of the RC cells. A sample of r rows, c columns and n elements
within each of the rc cells in the sample is taken. If we assume the sampling is such that each row has the
same probability of selection (r/R), that each column has probability of selection c/C, and each element
n/N, that the probabilities of sampling any two rows, columns, or elements within cells are respectively
r(r-1)/R(R-1), c(c-1)/C(C-1) and n(n-1)/N(N-1), and that rows, columns and elements within
cells are sampled independently, the expectations can be derived without further (and more disputable)
assumptions by elementary (but laborious) algebra to be


Sample mean square for     Expectation

Rows           $\frac{N-n}{N}\,\sigma_E^2 + \frac{C-c}{C}\,n\sigma_I^2 + nc\,\sigma_R^2$
Columns        $\frac{N-n}{N}\,\sigma_E^2 + \frac{R-r}{R}\,n\sigma_I^2 + nr\,\sigma_C^2$
Interaction    $\frac{N-n}{N}\,\sigma_E^2 + n\sigma_I^2$
Error          $\sigma_E^2$

where $\sigma_E^2$ is the within cell mean square for the population of RCN elements, $N\sigma_I^2$ the interaction mean
square for this population, $NC\sigma_R^2$ the row mean square and $NR\sigma_C^2$ the column mean square. The expectations
for Eisenhart's model I, model II, and mixed model follow as special cases.
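
The special cases can be read off by letting the sampling fractions go to one or to zero. The following sketch assumes the table reads: rows, $(1-n/N)\sigma_E^2 + (1-c/C)\,n\sigma_I^2 + nc\,\sigma_R^2$; columns, $(1-n/N)\sigma_E^2 + (1-r/R)\,n\sigma_I^2 + nr\,\sigma_C^2$; interaction, $(1-n/N)\sigma_E^2 + n\sigma_I^2$; error, $\sigma_E^2$. The function and argument names are hypothetical, and an infinite population is represented by `float("inf")`.

```python
INF = float("inf")


def expected_mean_squares(r, R, c, C, n, N, var_E, var_I, var_R, var_C):
    """Expected mean squares for the two-way classification under finite sampling.

    Uses the assumed reading of the table stated above; x / INF evaluates to 0.0,
    so an infinite population drops the corresponding sampling fraction.
    """
    within = (1 - n / N) * var_E
    return {
        "rows": within + (1 - c / C) * n * var_I + n * c * var_R,
        "columns": within + (1 - r / R) * n * var_I + n * r * var_C,
        "interaction": within + n * var_I,
        "error": var_E,
    }


components = dict(var_E=1.0, var_I=0.5, var_R=2.0, var_C=3.0)

# Model I: all rows and columns of the population are in the sample (r = R, c = C).
model_1 = expected_mean_squares(3, 3, 4, 4, 2, INF, **components)

# Model II: rows and columns are samples from infinite populations.
model_2 = expected_mean_squares(3, INF, 4, INF, 2, INF, **components)

# Mixed model: rows fixed (r = R), columns sampled from an infinite population.
mixed = expected_mean_squares(3, 3, 4, INF, 2, INF, **components)
```

In the mixed case the interaction component enters the expectation for rows but not for columns, which is what determines the proper error term for each test.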


STATISTICAL ABSTRACTS 


All communications concerning this section should be addressed to the
Abstracts Editor, Professor George E. Nicholson, Jr., Chairman of the Department
of Statistics, University of North Carolina, Chapel Hill, North
Carolina.


Anis, А. A., and Lloyd, E. H., “On the 
range of partial sums of a finite number of 
independent normal variates,” Biometrika, 
40 (1953), 35-42. 

Given a random sample of $n$ $x$'s $\{X_i\}$ from
a normal population. The partial sums,
$S_1, S_2, \ldots, S_n$, are formed, where
$S_r = \sum_{i=1}^{r} X_i$, $r = 1, 2, \ldots, n$.
The average value of the range of the $n$
partial sums is shown to be
$\sqrt{2/\pi}\,\sum_{r=1}^{n-1} r^{-1/2}$.
R. L. ANDERSON, North Carolina State College.
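
A numerical check of the mean range is straightforward. The sketch below assumes the variates are independent with zero mean and unit variance, and that the range is taken over $S_1, \ldots, S_n$ only (not including a zero starting value), in which case the mean range is $\sqrt{2/\pi}\sum_{r=1}^{n-1} r^{-1/2}$. The function names are hypothetical.

```python
import random
from math import pi, sqrt


def expected_range(n):
    """Mean range of the partial sums S_1..S_n of n independent N(0, 1) variates."""
    return sqrt(2 / pi) * sum(r ** -0.5 for r in range(1, n))


def simulated_mean_range(n, trials, seed=0):
    """Monte Carlo estimate of the same quantity, for comparison."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        s = 0.0
        partial_sums = []
        for _ in range(n):
            s += rng.gauss(0.0, 1.0)
            partial_sums.append(s)
        total += max(partial_sums) - min(partial_sums)
    return total / trials
```

For n = 2 the range is simply $|X_2|$, whose mean is $\sqrt{2/\pi}$, which agrees with the formula.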


Anscombe, F. J., “Sequential estimation,” 
Journal of the Royal Statistical Society, 15 
(1953), 1. 

The author reviews the literature on sequential
estimation, presents some approximate
methods for solving a number of
particular problems and discusses the possible
usefulness of sequential estimation
procedures.

A sequential estimation procedure is a
procedure in which the number of observations
is not fixed in advance, but depends,
according to some definite rule, upon the
observations themselves. An example of
such an estimation procedure, due to Haldane,
is the estimation of a binomial proportion
$p$ when it is desired that the standard
error of such an estimate be roughly
proportional to $p$. The rule is to take observations
until the number of successes is
equal to some prearranged number $x \ge 2$.
Then an unbiased estimate of $p$ is
$\hat{p} = (x-1)/(N-1)$ where $N$ is the total
number of observations. An unbiased estimate
of $\sigma_{\hat{p}}^2$ is $\hat{p}(1-\hat{p})/(N-2)$. Another
example is that of estimating the mean of a
normal population with unknown variance
by a confidence interval of width $d$ and confidence
coefficient $1-\alpha$. Stein has given a
double sampling procedure in which a first
sample of fixed size $n_0$ is taken and then a
further sample of $N-n_0$ observations is
taken, where $N$ depends on the observations
in the first sample. Reference is made to
a method of Girshick et al. for obtaining an estimate
of the unknown proportion $p$ at the
termination of a sequential test on a binomial
population. The author asserts that estimation
formulas valid for fixed sample sizes are
asymptotically valid for sequential sampling
if the sample size is large. He develops
a sequential estimation procedure for estimating
the mean of a normal population
with unknown variance by a confidence
interval of width $d$ and confidence coefficient
$1-\alpha$. If $t_\alpha$ is the normal deviate
which has probability $\alpha/2$ of being exceeded,
the rule is to stop taking observations
when $s^2$, the unbiased estimate of the
variance, is first less than or equal to
$n d^2/4t_\alpha^2$, where $n$ is the number of
observations so far taken. Modifications of this
rule are discussed which reduce the error of the
estimate. It is shown that the expected sample
size corresponding to this procedure is approximately
$4\sigma^2 t_\alpha^2/d^2 + 1 + t_\alpha^2$. Similar results
are obtained for estimating the difference
between two means, and estimating the
birth and death rates in a simple birth-death
process. Finally it is conjectured that
perhaps the main practical benefit to be
derived from studying the properties of sequential
stopping rules is to assess the degree
to which the usual fixed sample size
estimation formulas are affected. GEORGE
E. NICHOLSON, JR., University of North
Carolina.
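
Haldane's rule lends itself to a direct check. The sketch below assumes the rule as read here: sampling continues until a prearranged number x of successes (x at least 2) has been observed, N is the total number of observations, and the estimate is (x - 1)/(N - 1). It computes the expectation of the estimate by summing over the distribution of N, truncated at a large hypothetical limit `max_N`. The function names are hypothetical.

```python
def haldane_estimate(x, N):
    """Estimate of p after N observations were needed to obtain x successes."""
    if x < 2 or N < x:
        raise ValueError("need x >= 2 and N >= x")
    return (x - 1) / (N - 1)


def expected_haldane_estimate(p, x, max_N=10000):
    """Expectation of the estimate under inverse (negative binomial) sampling.

    P(N = k) = C(k-1, x-1) p**x (1-p)**(k-x) is built up by the recurrence
    P(N = k+1) = P(N = k) * k / (k - x + 1) * (1 - p); the sum is truncated at max_N.
    """
    prob = p ** x  # P(N = x): the first x observations are all successes
    total = 0.0
    for k in range(x, max_N + 1):
        total += prob * haldane_estimate(x, k)
        prob *= k / (k - x + 1) * (1 - p)
    return total
```

The computed expectation equals p up to rounding, which is the unbiasedness property quoted in the abstract; the naive ratio x/N does not share it.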


Altman, Irving B., “Relationship between
sample size and AOQL for attribute single
sampling plans,” Industrial Quality Control
(January 1954), 29-30.

A chart showing the AOQL values for a 
large variety of single sampling plans is pre- 
sented as a function of sample size and ac- 
ceptance number. GERALD J. LIEBERMAN, 
Stanford University. 


Bailey, Norman T. J., “The total size of a 
general stochastic epidemic,” Biometrika, 
40 (1953), 177-85. 

The author presents a stochastic treat- 
ment of the problem of estimating the num- 
ber (w) who will become infected if one in- 
fected person is introduced into a group of 
n susceptible individuals. Diagrams are presented
of the probability of w being infected
for n=10, 20 and 40 and p=n/4,
n/2 and n, where p is the ratio between the 
rate of removing infected people from the 
group and the infection rate. The average 
w is also computed. The results are extended
to the distribution of w for households,
when n is small; formulas are given
for n = 1(1)5. The author summarizes his 
results as follows: “The model used here is 
not adequate for diseases with short infec- 
tion periods, such as measles, but its ade- 
quacy for other infections requires testing.” 
R. L. ANDERSON, North Carolina State
College.


Bartlett, M. S., “Approximate confidence 
intervals,” Biometrika, 40 (1953), 12-19. 

The likelihood derivative, $T = \partial L/\partial\theta$, is
used to derive approximate confidence intervals
for various θ, assuming T is normal.
An approximate correction for skewness
(κ₃) of T is introduced, using
$T_\lambda = T + \lambda(T^2 - I)$, where I is $\sigma^2(T)$;
λ is approximated as $-\kappa_3/6I^2$.
Also if κ₃ = 0, a method is given
to adjust for kurtosis. Three examples are 
presented. R. L. ANDERSON, North Caro- 
lina State College. 


Cohen, A. C., “Estimating parameters in 
truncated Pearson frequency distributions 
without resort to higher moments,” Bio- 
metrika, 40 (1953), 50-7. 

Using the method of moments, estimat- 
ing equations are obtained which involve 
only the first four sample moments in con- 
trast to the first six previously employed. 
The general case of doubly truncated four 
parameter distributions and special cases of 
doubly and singly truncated samples are 
worked out or their solutions indicated. It is 
pointed out that these equations can either 
be solved directly or by an iterative process. 
An example is worked out to illustrate the 
method and afford an evaluation of the 
technique. D. C. HURST, North Carolina
State College.


Cox, D. R., and W. L. Smith, “The superposition
of several strictly periodic sequences
of events,” Biometrika, 40 (1953),
1-11.

N sources, each producing events at
regular intervals but with unequal periods,
θ_i, i = 1, 2, ..., N, are pooled. This produces
a sequence of unequal intervals. The
frequency distribution for these intervals is
derived. There is a point frequency, Q, for
the largest of these intervals, which is the
same as the smallest of the θ_i. A measure
of the interval variability, displayed in a
variance-time curve, can be used to distinguish
the given sequence from a random
sequence. Approximate procedures are
given for large N.

Given a pooled sequence, a method
is available to estimate the number of
sources, N.

Three examples are given. This reviewer


AMERICAN STATISTICAL ASSOCIATION JOURNAL, JUNE 1954 


could not check the results for Q. R. L.
ANDERSON, North Carolina State College.


Craig, C. C., “Some remarks concerning the 
Lot Plot Plan," Industrial Quality Control 
(September 1953), 41. 

The author presents a non-technical evaluation
of Dorian Shainin’s Lot Plot Plan.
He discusses three main points:

1. The misuse of lot plot for process control.

2. The severity of the operating characteristic
curve of the Lot Plot.

3. The dependence on histograms.

GERALD J. LIEBERMAN, Stanford University.


Craig, C. C., “Note on the use of fixed number
of defectives and variable sampling
sizes in sampling by attributes,” Industrial
Quality Control (May 1953), 83-86.

Under the assumption that samples are 
taken with replacements from small lots or 
that lots are large enough so samples taken 
without replacement will behave effectively 
as if they had been taken with replace- 
ments (infinite lots), the author examines 
the problem of attribute sampling where the 
random variable is the number of items 
sampled, n, before a fixed number, m, of
defectives are found. A control chart based
on the random variables n/m or n is presented.
GERALD J. LIEBERMAN, Stanford
University.


Craig, C. C., “On the utilization of marked 
specimens in estimating populations of fly- 
ing insects,” Biometrika, 40 (1953), 170-76. 
An observer catches butterflies, marks
them and then releases them. This is repeated
в times in a given area and the numbers
of butterflies caught one, two, ...
times are recorded. Five methods are used
to estimate the total number (n) of butterflies
in the area, three assuming a truncated
Poisson distribution and two a special
distribution due to Stevens. Approximate
variances are obtained for each estimate.
Two worked out examples are included,
plus the results of 14 other catches. The
author uses χ² to test the agreement
between actual and expected frequencies and
finds 15 significance probabilities >50%
and the 16th >30%. R. L. ANDERSON,
North Carolina State College.


Evans, D. A., “Experimental evidence concerning
contagious distributions in ecology,”
Biometrika, 40 (1953), 186-211.

The frequencies for a variety of counts
made on English and American plant species,
insects and larvae and moth eggs


STATISTICAL ABSTRACTS


are fitted to three different theoretical contagious
distributions: negative binomial,
Pólya-Aeppli and Neyman Type A. The
mathematical formulae for fitting are included.
For each distribution, two methods
of fitting are compared: (i) use of sample
mean and variance, (ii) use of sample
mean and proportion of zeros. A method of
choosing between (i) and (ii) is presented.
The Neyman Type A seemed to be best for
the English plants, the negative binomial
for the insects and moth eggs, and none of
the three seemed to be satisfactory for the
American plants.

Agreement between actual and expected
distributions is tested by two statistics
advanced by Anscombe. The
χ²-test for frequency data is also used;
the reviewer notes a tendency to pool
many classes on the tails. The Anscombe
statistics seem to be more sensitive to aberrations
from theory than is the all-purpose
χ². R. L. ANDERSON, North Carolina
State College.


Fisher, Walter D., “On a pooling problem
from the statistical decision viewpoint,”
Econometrica, 21 (1953), 567-85.
The problem posed is that of finding a
basis for grouping a set of K random
variables into a set of G subgroups (G ≤ K).
A procedure is outlined which places this
problem in the framework of statistical decision
theory. With a proper specification of
the random variable and with the explicit
definition of additional assumptions and
conditions, the logical subgrouping is determined
from the observed sample data.

The following formulation of the general
problem is offered. A sample (z_1, ..., z_K)
consists of one observation drawn independently
at random from each of K distinct
“cells.” The random variable z_i has
expectation θ_i, variance σ_i², and a measure
of importance m_i. The θ_i are unknown, the
σ_i² and m_i are known positive numbers. The
problem is to arrange the set of K cells
into G mutually exclusive subgroups according
to some partition P, and to choose
a decision vector t_P = (t_1P, ..., t_KP), where
t_iP denotes a decision number or “estimated
characteristic” of that group to which the
cell i is assigned by partition P (t_iP for cells
assigned to the same group will be identical).
Conditions may be imposed limiting
the number of subgroups G and also the
partitioning procedure. There remains a
set of theoretically admissible alternative
partitions and a number of possible
decision vectors for each partition. A
choice of the optimum decision vector t*
involves the multiple choice of an optimum
partition P* and an optimum set of decision
numbers t_P*. Following decision theory a
loss function W(θ, t_P) is defined which
measures the cost incurred by making the
decision t_P when the true parameter point
is θ. The vector t* is determined as the set
of decision numbers which minimizes the
expected value of W over the domain of
possible samples.

The procedure is formulated in specific 
terms with reference to the following hypo- 
thetical problem: A public price control 
agency is assumed to have the task of 
setting selling prices for a given com- 
modity. The commodity is sold in K inde- 
pendent markets, each with a linear demand 
function of the form $p_i = 2\theta_i - (1/m_i)q_i$,
where p_i is price, q_i is quantity sold, and m_i
is a positive number (i = 1, ..., K). It is
assumed that the price control agency in 
setting prices is guided by two conflicting 
policies: (1) to be “fair” to sellers, and (2) 
to attain low administrative costs. Admin- 
istrative costs are assumed known and are 
assumed to increase with the number of 
different prices set by the agency. A meas- 
ure of “unfairness” to sellers is assumed in 
the present problem to be represented by 
the departure of actual realized returns 
under administered prices from maximum
realizable returns under free market prices.
Under the conditions assumed the expression
for the loss function becomes
$W(\theta, t) = c_G + \sum_{i=1}^{K} m_i (t_{iP} - \theta_i)^2$,
where the first term
on the right represents total administrative
costs incurred if G different prices are set
and the second term represents aggregate
returns sacrificed by sellers as a result of
the administered prices t_iP.
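
The quadratic term in the loss function follows from the linear demand function in a few lines. The derivation below is a sketch, assuming the demand function is read as $p_i = 2\theta_i - q_i/m_i$ and writing $R_i$ (a symbol introduced here for illustration) for the seller's return in market i:

```latex
\begin{align*}
q_i &= m_i\,(2\theta_i - p_i), \\
R_i(p_i) &= p_i q_i = m_i\,(2\theta_i p_i - p_i^2), \\
\max_{p_i} R_i(p_i) &= R_i(\theta_i) = m_i\theta_i^2, \\
R_i(\theta_i) - R_i(t_{iP}) &= m_i\theta_i^2 - 2m_i\theta_i t_{iP} + m_i t_{iP}^2
   = m_i\,(t_{iP} - \theta_i)^2 .
\end{align*}
```

Under this reading θ_i is the return-maximizing price in market i, and the return sacrificed at an administered price t_iP is the weighted squared departure from it.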

Apart from the price agency illustration
but assuming a loss function of the same 
specific form, stochastic properties are 
ascribed to the set 0 and a risk function 
(expected value of the loss) defined. A limiting
Bayes solution to the decision problem
is derived (and presented in an appendix) 
under the assumption of a sequence of a 
priori distributions of 0 having constant 
density and increasing range. It develops 
that the optimum decision procedure t* is 
the following: For a sample z (in the price
agency problem z_i is estimated equilibrium
price in market i) select for each value of G
(number of different administered prices)
the admissible partition P_G* of the K cells
(independent markets) which minimizes the

$\sum_{i=1}^{K} m_i (z_i - \bar{z}_{iP})^2$, where $\bar{z}_{iP}$
is a weighted sample group mean. The
weighted means of the sample observations
within groups formed by the optimum partition
for a specified G give the optimum
set t_G*. Denote the above sum of squares
corresponding to P* for a specified G by
S_G*. Then the optimum set t* is given by G
which minimizes (c_G + S_G*).
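
The selection procedure just described can be sketched in a few lines. This is an illustration under assumptions, not the author's computation: the weights in the sum of squares are taken to be the m_i, the search is restricted to groupings that are contiguous once the cells are sorted by z (one possible restriction on the partitioning procedure), and all function names are hypothetical.

```python
from itertools import combinations


def weighted_ss(z, m):
    """Weighted sum of squares of z about its m-weighted mean."""
    mean = sum(mi * zi for zi, mi in zip(z, m)) / sum(m)
    return sum(mi * (zi - mean) ** 2 for zi, mi in zip(z, m))


def best_partition(z, m, G):
    """Return (S_G*, groups) over partitions into G groups that are
    contiguous in the ordering of the cells by z (assumed restriction)."""
    order = sorted(range(len(z)), key=lambda i: z[i])
    best_ss, best_groups = None, None
    for cuts in combinations(range(1, len(z)), G - 1):
        bounds = (0, *cuts, len(z))
        groups = [order[a:b] for a, b in zip(bounds, bounds[1:])]
        ss = sum(weighted_ss([z[i] for i in g], [m[i] for i in g]) for g in groups)
        if best_ss is None or ss < best_ss:
            best_ss, best_groups = ss, groups
    return best_ss, best_groups


def optimum_pooling(z, m, cost):
    """cost maps each admissible G to c_G; returns (G*, groups, c_G + S_G*)."""
    candidates = []
    for G, c in cost.items():
        ss, groups = best_partition(z, m, G)
        candidates.append((c + ss, G, groups))
    total, G_star, groups = min(candidates, key=lambda t: t[0])
    return G_star, groups, total


if __name__ == "__main__":
    z = [5.0, 1.0, 5.2, 1.2]           # made-up sample prices
    m = [1.0, 1.0, 1.0, 1.0]           # uniform importance weights
    cost = {1: 0.0, 2: 1.0, 3: 2.5}    # made-up administrative costs c_G
    print(optimum_pooling(z, m, cost))
```

The optimum decision numbers are then the weighted means of the observations within each group of the chosen partition.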

This solution is applied to sample data on 
retail prices of fresh tomatoes in nineteen 
independent retail markets in California. In 
this numerical example restrictions are 
placed on the partitioning procedure which 
limit the number of admissible partitions 
for each G greater than 1, and G itself, 
limiting the maximum number of different 
administered prices to 4. In addition, a 
schedule of administrative costs c_G is assumed
for G = 1, ..., 4, and m_i is assumed
uniform for all markets. The resulting optimum
set of administered prices is presented.
It turns out that the optimum number of
different prices in this case is 4, the maximum
allowable.

In a concluding part the author offers 
some comments with respect to limitations 
and possible extensions of the proposed 
pooling procedure and discusses briefly the 
relation of the pooling problem to certain 
other subjects such as the problem of aggregation,
discriminant analysis, classification
problems, and factor analysis. IVAN M.
LEE, University of California.


Fox, Karl A., “A spatial equilibrium model 
of the livestock-feed economy in the 
United States," Econometrica, 21 (1953), 
547-66. 

In this paper the author obtains numerical
solutions for a ten-region spatial equilibrium
model for feed grain of the type
formulated by P. A. Samuelson in “Spatial
Price Equilibrium and Linear Programming,”
The American Economic Review,
June, 1952. The problem is not formulated
explicitly in a statistical estimation
framework, although actual data and approximate
relations derived from actual
data are employed in the computations. The
paper serves primarily to illustrate unique
numerical solutions to the spatial equilibrium
model for a particular simplified
formulation of the problem and under alternative
assumptions regarding the magnitudes
of certain variables entering the model.

Quantitative results showing regional 
equilibrium prices, net trade, and regional 
patterns of trade for conditions with respect 
to freight rates, feed supplies, livestock 
numbers, and livestock prices approximat- 
ing those for 1949-50 are shown. Solutions 
are also shown for two alternative sets of 
conditions: (1) Conditions like 1949-50 ex- 
cept that freight rates are assumed uni- 
formly 50 per cent higher and (2) conditions 
like 1949-50 except that feed supplies are 
assumed lower in the Corn Belt and Northern
Plains regions by 30 per cent and 75 per
cent, respectively. An “application to fore- 
casting” is also illustrated, which differs 
from the problem solved for 1949-50 condi- 
tions only in that July, 1947 forecasts and 
December, 1947 estimates of feed grain pro- 
duction are used in arriving at regional feed 
grain supplies. IVAN M. LEE, University of
California. 


Gulde, Harold J., “Acceptance sampling by 
variables using the range,” Industrial 
Quality Control (November 1953), 18-95. 

Formulas and charts are presented for
constructing variables sampling plans using
the range and average range. The results are
based upon the Normal approximations.
GERALD J. LIEBERMAN, Stanford University.


Horsnell, G., “The effect of unequal group 
variances on the F-test for the homogeneity 
of group means,” Biometrika, 40 (1953), 
128-36. 

The author extends the 1951 investiga- 
tions of David and Johnson (Biometrika, 
38: 43), using four groups in a single 
classification analysis of variance. Most of 
the computations are based on three equal 
within-group variances with a fourth vari- 
ance three times as large. Assuming equal 
group means, the actual significance prob- 
abilities corresponding to nominal 5% and 
1% significance levels of F are obtained for 
equal and various combinations of unequal 
numbers of observations per group. 

The power of the usual F-test to detect
one divergent group mean (out of the four)
is approximated both when the variable
group has and has not the divergent mean.

The author summarizes his results as
follows: If there is no very clear information
as to heterogeneity, use equal group
frequencies. If it is known that one group
is more variable, make sure that this group
does not have fewer observations than the
others; if possible, take a few extras in this
group.

An alternative to the usual F-test is also
mentioned. R. L. ANDERSON, North Carolina
State College.


Kamat, A. R., “On the mean successive dif- 
ference and its ratio to the root mean 
square,” Biometrika, 40 (1953), 116-27. 

Given a sequence of n normal variates
{x_i} with means {μ_i} and common variance
σ². The mean successive difference is

$d = \sum_{i=1}^{n-1} |x_i - x_{i+1}| / (n-1)$.




The first four moments of d/σ are derived
when all μ_i = μ. The standard deviation,
β₁ and β₂ values of d/σ are tabulated for
n = 3(1)10(5)30, 40, 50; a type I Pearson
curve was used to obtain approximate
upper and lower ½, 1, 2½ and 5% percentage
points. Exact results are available for n=3.
Empirical sampling was used to check some
results.

The usefulness of d over the standard deviation,
s, is shown when the μ_i shift.

The distribution of W = d/s is considered
when all μ_i = μ; its mean, standard deviation,
β₁ and β₂ values are given for
n = 5(5)30, 40, 50. Approximate percentage
points are also presented. R. L. ANDERSON,
North Carolina State College.
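
The two statistics discussed in this abstract are simple to compute. The sketch below assumes that d is the mean of the absolute successive differences and that s is the sample standard deviation with divisor n−1 (the abstract does not state the divisor); the function names are hypothetical.

```python
import statistics


def mean_successive_difference(x):
    """d: average absolute difference between neighbouring observations."""
    n = len(x)
    return sum(abs(x[i] - x[i + 1]) for i in range(n - 1)) / (n - 1)


def ratio_w(x):
    """W = d / s, with s the sample standard deviation (divisor n - 1, assumed)."""
    return mean_successive_difference(x) / statistics.stdev(x)


if __name__ == "__main__":
    steady = [1, 3, 2, 5]
    trend = list(range(10))  # means shifting steadily upward
    print(mean_successive_difference(steady), ratio_w(steady))
    print(mean_successive_difference(trend), ratio_w(trend))
```

With a steady upward shift in the means, d stays at the step size while s grows with the length of the series, which is the kind of situation in which d is preferred to s.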


Kamat, A. R., “Incomplete and absolute
moments of the multivariate normal distribution
with some applications,” Biometrika,
40 (1953), 20-34.

Given n normal variates, {z_i}, with zero
means and given variances and covariances.
Absolute moments are moments of {|z_i|}.
Exact results are given for n=2, 3, and
power-series derivations when n>3.
These results are applied to the distribution
of Σ|z_i|, using Pearson-curve approximations.
R. L. ANDERSON, North Carolina
State College.


King, E. P., “Estimating the standard deviation
of a normal population,” Industrial
Quality Control (September 1953), 30-33.

The efficiency of various estimates of the
standard deviation of a normal population
is presented when nk observations are
grouped into k subgroups of n observations.
GERALD J. LIEBERMAN, Stanford University.


Latscha, R., “Tests of significance in a 2×2
contingency table: Extension of Finney’s
table,” Biometrika, 40 (1953), 74-80.

Finney’s 1948 table (Biometrika, 35: 148)
for exact tests of independence in a 2×2
contingency table with fixed border
totals is extended to A = 20. The table
is laid out as follows:

                  Number of
             Successes   Failures    Total
Series I         a         A−a         A
Series II        b         B−b         B
Total          r=a+b      A+B−r       A+B



The experimenter defines a success and
Series I such that A ≥ B and a/A ≥ b/B.
For a given A, all combinations of B and a
are included which will give a significant
result, where the significance probabilities
are α = .05, .025, .01 and .005 (one-tailed).
For each combination of A, B and a, the
author gives the value of b (< a) which is
just significant at the α-level, b_α, and the
exact probability that b ≤ b_α, for a given A,
B and r. R. L. ANDERSON, North Carolina
State College.
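
The quantities tabulated can be reproduced from the hypergeometric distribution. The sketch below is an illustration rather than the method used to build the published table: it assumes the layout with a successes out of A in Series I and b out of B in Series II, and the function names are hypothetical.

```python
from math import comb


def one_tailed_p(a, A, b, B):
    """Exact probability of b or fewer successes in Series II,
    given the border totals A, B and r = a + b."""
    r = a + b
    lowest = max(0, r - A)
    favourable = sum(comb(B, j) * comb(A, r - j) for j in range(lowest, b + 1))
    return favourable / comb(A + B, r)


def critical_b(A, B, a, alpha):
    """Largest b whose one-tailed probability does not exceed alpha,
    or None if even b = 0 is not significant."""
    best = None
    for b in range(0, min(a, B) + 1):
        if one_tailed_p(a, A, b, B) <= alpha:
            best = b
        else:
            break
    return best


if __name__ == "__main__":
    print(one_tailed_p(4, 4, 0, 4))
    print(critical_b(4, 4, 4, 0.05))
```

Note that r changes with b when a is held fixed, so each candidate b is tested against its own set of border totals.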


Lindley, D. V., “Estimation of a functional 
relationship,” Biometrika, 40 (1953), 47-49. 

It is desired to estimate α and β in the
linear model V = α + βU, where the only
observations on U and V are subject to
error and all distributions are normal. U
and V are allowed to be either mathematical
variables or random variables. If the ob- 
servations on one of the variables are at pre- 
determined fixed points, the so-called “con- 
trolled” observations, it is demonstrated 
that the estimating equations for α and β
can be obtained in the usual way. This re- 
sult also applies if these “controlled” ob- 
servations are allowed to be random. If 
there is no “control” on either set of ob- 
servations, the estimation is deemed impos- 
sible without recourse to supplementary 
information. D. C. Hurst, North Carolina 
State College. 


Moore, P. G., “A sequential test for random- 
ness,” Biometrika, 40 (1953), 111-15. 

Given a sequence of observations, where 
each observation falls into one of two alternative
categories. A sequential procedure is
presented for testing the hypothesis, H₀, that
these observations occur in random order,
against the alternative H₁, that there is
dependence of the kind found in a simple
Markoff chain. The procedure is illustrated 
with annual rainfall data. C. E. GATES, 
North Carolina State College. 


Scheffé, Henry, “A method for judging all 
contrasts in the analysis of variance,” Bio- 
metrika, 40 (1953), 87-104. 

The author presents and proves the
validity of a method for making further
inferences about the contrasts among a set of
true means or main effects μ_1, μ_2, ..., μ_k
following the rejection of the hypothesis
H: μ_1 = μ_2 = ... = μ_k by the conventional
F-test with k−1 and ν degrees of freedom in
the analysis of variance. The method is
based on a probability statement concerning
the infinite totality of contrasts of the
type

$\theta = \sum_{i=1}^{k} c_i \mu_i$

where the c_i are any known constants, satisfying
the condition

$\sum_{i=1}^{k} c_i = 0$.

If the assumptions usual in the analysis of
variance hold and $\hat\theta$ and $\hat\sigma_{\hat\theta}^2$ denote the
estimates of θ and the variance of $\hat\theta$, then it is
proved the probability is 1−α that the
values θ of all possible contrasts simultaneously
satisfy

$\hat\theta - S\hat\sigma_{\hat\theta} \le \theta \le \hat\theta + S\hat\sigma_{\hat\theta}$    (1)

where S² is k−1 times the upper α point of
the F-distribution with k−1 and ν degrees
of freedom. The result holds for any values
of the unknown parameters. The result
may be used for the interval estimation of
any contrast of interest or to declare any
estimated contrast "significantly different
from zero" or not, according as the corresponding
interval (1) excludes 0 or not,
contrasts suggested by the observed means
$\hat\mu_i$ included.

A method has been prescribed by Tukey
for the special case where all the $\hat\mu_i$ have the
same variance and all pairs $\hat\mu_i$, $\hat\mu_j$ (i≠j) have
the same covariance, and where there is
interest in only a subset of the totality of
contrasts, namely the (1/2)k(k−1) differences
μ_i − μ_j. This latter method is shown
to have greater sensitivity in this situation
than the method based on (1) in the sense
that the confidence intervals are shorter.

The author examines the operating char- 
acteristic of the method by determining (i) 
the probability that all contrasts whose 
true values are zero would be declared “not 
significantly different from zero,” and (ii)
the probability that all normalized contrasts
whose true values are greater in
absolute value than some specified bound
will be declared “significantly different
from zero.” D. G. HORVITZ, North Carolina
State College.
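
The simultaneous interval for a single contrast can be sketched as below. This is an illustration under assumptions not stated in the abstract: a one-way layout with independent group means, so that the estimated variance of the contrast is the error mean square times Σc_i²/n_i, and an upper α point of F supplied by the reader from tables. The function and argument names are hypothetical.

```python
import math


def scheffe_interval(means, ns, mse, c, f_crit):
    """Interval theta_hat -/+ S * se for the contrast sum(c_i * mu_i).

    means:  observed group means
    ns:     group sizes
    mse:    error mean square, on nu degrees of freedom
    c:      contrast coefficients, which must sum to zero
    f_crit: upper alpha point of F with k-1 and nu degrees of freedom
    """
    if abs(sum(c)) > 1e-12:
        raise ValueError("contrast coefficients must sum to zero")
    k = len(means)
    estimate = sum(ci * yi for ci, yi in zip(c, means))
    se = math.sqrt(mse * sum(ci ** 2 / ni for ci, ni in zip(c, ns)))
    S = math.sqrt((k - 1) * f_crit)
    return estimate - S * se, estimate + S * se


if __name__ == "__main__":
    # Made-up data: three groups of five, error mean square 4 on 12 d.f.;
    # 3.89 is roughly the 5% point of F with 2 and 12 degrees of freedom.
    print(scheffe_interval([10, 12, 17], [5, 5, 5], 4.0, [1, -1, 0], 3.89))
    print(scheffe_interval([10, 12, 17], [5, 5, 5], 4.0, [1, 1, -2], 3.89))
```

A contrast is declared significantly different from zero when its interval excludes zero; the same value of S serves every contrast, including those chosen after looking at the data.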


Stevens, W. L., “Tables of the angular
transformation,” Biometrika, 40 (1953),
70-73.

The angular transformation is defined as

$\theta = 50 - (100/\pi) \arcsin(1 - 2p)$,

where p is the proportion for a
binomial population. Values of θ are given
to 8 decimal places for

p = 0(.0001).02(.001).5,

and for selected proper fractions. R. L.
ANDERSON, North Carolina State College.




Stuart, A., “The estimation and comparison
of strengths of association in contingency
tables,” Biometrika, 40 (1953), 105-10.

A modification of Kendall’s rank correlation
coefficient, t_c, is proposed for measuring
strength of association between two characteristics
in an r×s contingency table.
Conservatively approximate confidence limits
are set for the population association.
An approximate test is provided for the
difference between the coefficients calculated
by this method for two r×s contingency
tables. The coefficient used is

$t_c = t_a \cdot \frac{n-1}{n} \cdot \frac{m}{m-1}$,

where t_a designates Kendall's coefficient of
rank correlation. S. WEINER, North Carolina
State College.
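
The coefficient can be computed directly from the cell counts of an ordered table. In the sketch below, t_a is taken as 2S/[n(n−1)] with S the number of concordant minus discordant pairs, and m is taken to be the smaller of r and s; the abstract does not define m, so that reading, like the function names, is an assumption.

```python
def kendall_s(table):
    """Concordant minus discordant pairs for an ordered r x s table of counts."""
    r, s = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(r):
        for j in range(s):
            for i2 in range(i + 1, r):
                for j2 in range(s):
                    if j2 > j:
                        concordant += table[i][j] * table[i2][j2]
                    elif j2 < j:
                        discordant += table[i][j] * table[i2][j2]
    return concordant - discordant


def stuart_tc(table):
    """t_c = t_a * (n - 1)/n * m/(m - 1), with m = min(r, s) (assumed)."""
    n = sum(sum(row) for row in table)
    m = min(len(table), len(table[0]))
    t_a = 2 * kendall_s(table) / (n * (n - 1))
    return t_a * (n - 1) / n * m / (m - 1)


if __name__ == "__main__":
    print(stuart_tc([[10, 0], [0, 10]]))
    print(stuart_tc([[5, 5], [5, 5]]))
```

The adjustment lets the coefficient reach unity for a table with all observations on the diagonal, which t_a cannot do because of ties.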


Tanner, J. C., “A problem of interference 
between two queues,” Biometrika, 40 (1953), 
58-69. 

Imagine a bridge of certain length AB,
only wide enough to admit a single lane of
traffic, with traffic arriving at both ends.
Vehicles V₁ arrive at A at random and at
an average rate q₁ per unit time. A particular
V₁ takes time α₁ to cross AB, but
may enter AB only provided no V₂ vehicle
is in AB, and if no other V₁ vehicle has
entered AB during the previous time β₁.
Similar sets of conditions, α₂ and β₂, affect
the V₂ vehicles arriving at B. The mean delay
time w₁ and w₂ for α₁ ≥ β₁ and α₂ ≥ β₂ is
obtained as a function of t₁ and t₂, the time
for a block of V₁ or V₂ vehicles to cross, and
r₁ and r₂, the number of vehicles waiting to
enter the bridge. Explicit solutions are made
available for the special cases q₂ = 0 and
β₁ = β₂ = 0. Further applications to the problems
of delays in crossing or entering a heavily
travelled artery are described, and delays
due to intersecting lines of traffic discussed.
Two short tables are included, giving
the values of w₁ for β₁ = β₂ = 0, α₁ = α₂ = 1
and for w₂ when α₂ = β₂ = 0, α₁ = 1. J. S.
HUNTER, North Carolina State College.


Whitfield, J. W., “The distribution of total
rank value for one particular object in m
rankings of n objects,” The British Journal
of Statistical Psychology, 6, part 1 (1953), 35-40.

The title clearly indicates a class of problems
not appropriately handled by Kendall's
treatment of the general m ranking
problem. As an illustration one can consider
a situation in experimental social psychology
where one person in a group is instructed
to play a predetermined role and each of
the other members of the group is asked to
rank his fellow members (including the
experimental person) in connection with certain
characteristics. Under the assumption
that for the first ranking each value (1, 2,
..., n) has equal probability the frequency
distribution for m rankings of n
objects is constructed; the mean total rank
value is $\tfrac12 m(n+1)$, its variance is
$m(n^2-1)/12$, and $\beta_2 = 3 - 6/(5m) - 12/[5m(n^2-1)]$.
Tables of exact probability values
are given for m and n up to 8. For higher
values of m or n,
$(|\text{Total Rank Value} - \tfrac12 m(n+1)| - \tfrac12)/\sqrt{m(n^2-1)/12}$
is approximately a normal deviate. For example,
if eight judges rank seven objects, and
the experimental object has a total rank
value of 43, the approximation gives P =
.0318 against the exact value of .03113.
HERBERT SOLOMON, Columbia University.
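
The worked figures in this abstract can be checked numerically. The sketch below assumes that the m rankings are independent, that each rank 1, ..., n is equally likely for the object in question, and that the quoted probabilities are upper-tail values; the function names are hypothetical.

```python
import math


def total_rank_distribution(m, n):
    """Exact distribution of the total of m independent ranks, each uniform on 1..n."""
    dist = {0: 1.0}
    for _ in range(m):
        new = {}
        for total, p in dist.items():
            for rank in range(1, n + 1):
                new[total + rank] = new.get(total + rank, 0.0) + p / n
        dist = new
    return dist


def upper_tail_exact(m, n, value):
    """Exact probability that the total rank value is at least `value`."""
    return sum(p for t, p in total_rank_distribution(m, n).items() if t >= value)


def upper_tail_normal(m, n, value):
    """Normal approximation with the continuity correction of one half."""
    mean = m * (n + 1) / 2
    sd = math.sqrt(m * (n * n - 1) / 12)
    z = (abs(value - mean) - 0.5) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))


if __name__ == "__main__":
    # Eight judges, seven objects, total rank value 43.
    print(upper_tail_exact(8, 7, 43))
    print(upper_tail_normal(8, 7, 43))
```

For eight judges and seven objects the mean is 32 and the variance 32, so a total of 43 lies a little under two standard deviations above the mean.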




BOOK REVIEWS 


Introduction to the Theory of Statistics. Victor Goedicke. New York: Harper and
Brothers, 1953. Pp. xii, 286. $4.50.


BERNARD L. WELCH, University of Leeds, England


THE present text sets out the fundamentals of statistical method in a
sound and readable manner and can be recommended to students who 
are starting upon the subject from the very beginning. 

Since no use is made of the calculus the author is often forced to be purely 
descriptive and to dispense with much theoretical justification. This seems 
to be inescapable when only a small prior knowledge of mathematics is as- 
sumed. The reader is not even required beforehand to be familiar with the use 
of logarithms and with simple graphs. Sections are devoted to these topics at 
appropriate points as they are needed. I feel, myself, that writers of elementary
books on statistics should not be expected to go so far as this in remedying
deficiencies in the mathematical equipment of their readers. If a student has 
not already at least some facility with graphs and logarithms then he is, I 
believe, ill-advised to start to grapple with the theory of statistics at all. 

The decision of the author to include sections on permutations and elementary
probability, however, is not open to the same criticism and his introduction
to these topics is admirable. In his treatment of frequency distributions,
the expression of frequency as a proportion per t unit (where
t = (x − x̄)/σ) leads on naturally to the standard normal curve as a graduation of
frequency distributions arising in many different practical contexts. Some
theoretical justification of the general use of the normal curve is attempted
by including a numerical investigation of the limiting behavior of the binomial
distribution. This section, however, makes more difficult reading than
the rest of the book and could possibly be simplified in places.

The treatment of correlation is excellent, the emphasis rightly being placed 
on the square of the correlation coefficient rather than r, itself—the idea 
being to stress that the important factor is the fraction of the variance accounted
for by a straight-line relation. Multiple correlation is dealt with again
from the same angle and the examples used to illustrate it are well-chosen. 

The introduction to the sampling variability of such simple statistics as a
sum, a difference, and a mean value is clear and leads on to the general problem
of testing statistical hypotheses where, however, some statements are
made which are open to criticism. For instance in an inquiry where the effects
of two drugs on the period of convalescence after illness are being compared,
the author gives an example where the probability is 0.0018 that a difference
between drug-means greater than the one observed could have arisen
on the null hypothesis. He goes on to deduce that “the available evidence
indicates that the probability that the drug does not affect the duration of
convalescence is only 0.0018, while the probability that it does affect duration
is 0.9982.” Such a statement as this is certainly unwarranted and something
by way of explanation seems to be called for at this point. One
realizes, of course, that it is somewhat difficult to explain in an elementary
text what the interpretation of the figure 0.0018 should be. The devotion
of more space to the clarification of this general problem of testing
hypotheses would not be a waste, however, for as they stand, the sections on
it suffer by reason of compression.

The concluding section of the book draws attention to certain common-sense
points which the novice is apt to overlook.

The standard of printing is high throughout and altogether this is a useful
addition to the texts already available on the elementary parts of statistical
method.


Statistical Inference. Helen M. Walker and Joseph Lev. New York: Henry Holt
and Company, 1953. Pp. xi, 510. $6.25.


PALMER O. JOHNSON, University of Minnesota


This book is written primarily for non-mathematical students. The complete
text is planned for a course two semesters in length. The problems
associated with writing text-books for the type of students specified are particularly
difficult if the responsibilities of the author(s) are to be fulfilled.
There is the problem of what mathematics can be assumed on the part of the
reader. A second problem resides in the selection of content. Since texts of
the type discussed here are written for those who are to become practitioners
or research workers, the most important statistical procedures must be described
and the computational problems arising must be discussed. The methods
must also be illustrated with representative examples from the field or
fields of application, usually with the particular field with which the author
is familiar. There is likewise the selection of problems to be solved by the
student for testing his understanding of the theory and methods. The writer
of a text for non-mathematical students is under particular obligation to
write rigorously and to indicate in his own illustrations the assumptions
to be fulfilled if the procedures are to be validly applied. This is particularly
important in a book whose readers lack the knowledge necessary to check the
author's statements.

Unfortunately there are no properly designed experiments to test the effectiveness
of text-books on modern statistical methods. It is, therefore, difficult
to offer criticisms that are completely objective. Most statisticians have
strong convictions not only on the relative importance of various statistical
procedures but also on how mathematical (even for non-mathematical students)
a text-book should be.

In the reviewer’s judgment, the authors of this text have in general succeeded
in writing a text that reaches a high level of attainment in meeting the
criteria suggested above.

The mathematical requirements would be chiefly met by a good command




of arithmetical and algebraic processes. There is considerable use of probability
symbolism, multiple subscripts, multiple summation, and specified limits
of summation. In this connection, well stated explanations and practice
exercises in the processes involved are provided. With respect to previous 
statistical training, the student with good ability and familiarity with sym- 
bolism can acquire the elementary calculation processes, such as calculation 
of the mean, standard deviation, regression, and correlation from the treat- 
ment in the text. 

Skilful use is made of intuitive reasoning and of effective graphical devices 
for developing a functional understanding of statistical concepts and proc- 


experiment. Chapter 2 lays the basis for a more formal treatment of probabil- 
ity and its role in statistical inference. Fundamental concepts are introduced, 


pling distribution, properties, the chi-square curves and tables. A number of 
applications are made, emphasizing calculations and interpretation. No mention
is made of the criticisms and limitations of the chi-square test of “goodness
of fit,” particularly that the power of the test to detect disagreement
between hypotheses and observations is determined largely by the size of the
samples.

In Chapter 5 and in subsequent chapters consideration is given to problems
in inference involving continuous variables. Certain general concepts under- 
lying distribution of a continuous variable are developed. Notions of the 
parent population, unbiased estimates, and a test of normality are included. 


distributions. The property of independence of statistics is well portrayed
by plotting the joint frequency distribution of the means and standard
deviations of samples from a normal population. Chapter 7 presents methods
for testing hypotheses and for estimating population means and differences
between means of two populations. The test of alternative hypotheses is




BOOK REVIEWS 381


[especially] well treated, including the graphical interpretation of the power
[function]. Limited consideration is given to the design of samples in surveys.
[Formulas] for the mean and variances are given for stratified and cluster
[sampling]. Along similar lines Chapter 8 considers the problems of inference
concerning variances and standard deviations of normal populations. Special
consideration is given to the use of the tables of the F-distribution. Bartlett's
and Hartley's tests for the equality of variances of several samples are applied
to practical problems. There is also a test for differences between variances
of [correlated] measures.
Chapter 9 presents some of the simpler applications of the analysis of
variance. A k-sample problem of testing the equality of means is illustrated
[by an] experiment designed to determine if order of shooting three different
[...] in archery had any effect. The underlying mathematical model is
[stated] including assumptions. In this connection it would have been desirable
to carry out the tests of normality and of equality of variances rather
than merely to indicate that these assumptions had been satisfied. The test
of the equality of means is carried out by estimating the unknown population
variance from the variation among the three sample means and from the
variation among the archers within groups, then by taking the ratio of the
two estimates, the variance ratio, or F, as the test statistic. This approach
makes it difficult for students to perceive why these sums of squares were
used. The approach through stating the mathematical model in algebraic
form or in the form of a linear hypothesis seems to be more informative.
[It] would entail the use of the method of least squares or of maximum likelihood
to obtain the appropriate sums of squares. In this chapter there is an
excellent explanation of the form of the F-distribution, including curves of
the sampling distribution of F and 1/F.

In Chapter 10 methods of statistical inference are applied to bivariate
data. A mathematical model for linear regression and the sampling distributions
of statistics in linear regression are presented as the bases for testing
[...]
model and the normal bivariate population model are treated. The distribution
of the product moment correlation coefficient is described. Tests
of hypotheses about ρ are carried out and confidence intervals for ρ set up.
There is another chapter (Chapter 11) given to other measures of relationship.
The chief contribution here is a test of significance for point-biserial-r
and a brief summary of Tate's unpublished study comparing bi-serial and
point-biserial-r.

[...] Most of the content of this chapter is of long standing and indicates
how little workers in this field have been influenced by the developments in
modern statistical inference. Most of the old algebraic formulations are




reported in this chapter. Perhaps an opportunity would have been afforded 
to report some beginnings that have been made and other help that could 
be had from ideas and developments in statistical inference. Examples are 
Wilks' L_mvc criterion for parallel tests, Votaw's test of compound symmetry,
component mean square analyses in regression problems, the practical sampling
problems of reliability and validity, and the uses that are beginning
to be made of information theory in mental test theory and construction.

Chapter 13 gives a lucid explanation of the methods and interpretation of 
multiple regression and correlation. Illustrations are given of the Doolittle
method and Fisher’s modification of the Doolittle method in obtaining a 
multiple regression equation and tests of significance. There is no considera- 
tion in this or other chapters of other methods of multivariate analyses. Thus 
the discriminant function designed for obtaining the best system of weights 
of various independent variables for distinguishing among groups of individuals
is not treated, nor are such methods discussed as Hotelling's T² for
testing the significance of the difference between means of multivariate 
normal populations, and methods of classification of an individual based on 
a number of measurements into one of several categories. Examples of the 
latter methods are the generalized distance function and the statistical de- 
cision function. 

In Chapter 14, the analysis of variance is extended to the analysis of two 
or more variables of classification. This chapter bases its discussion around 
data derived from studies employing modern experimental designs. In fol- 
lowing through the analysis of the designs presented, the student should 
gain a realistic conception of the role of statistics in the analysis of the data
from modern designs. It would have been desirable to have given more explanation
to the role of statistics in the planning stage of the design and
greater emphasis to the functions of randomization and replication. 

A few points of criticism will be made to an otherwise admirable chapter:


On p. 349, μ in the two hypotheses appears to be the same value, which in general is
not true unless the value is zero. On p. 354, the test of the hypothesis that 


ment made tying these values up with Model II of the analysis of variance.
On p. 363, recognition is given to the circumstance that when several meas- 
ures are obtained from the same individual, the measures are likely to be 
correlated. It is not clearly indicated how, if at all, the effects of the correlated
observations have been removed.


Chapter 15 gives special consideration to the use of the analysis of covariance
in group comparisons on one variable when information is available
on another or on several other variables which are correlated with it. Skilful
demonstration is given by the analysis of data from a number of important
research problems in this area.
In Chapter 16, certain special uses of percentiles are illustrated in analysis




leading to inferences. In Chapter 17 several useful transformations are discussed.
The final chapter discusses and illustrates the use of a number of
non-parametric methods in testing hypotheses and in setting up confidence
intervals. 

The book has an exceptionally large number of useful tables and charts 
(20 in all), besides a table of squares, square roots, and reciprocals, a table 
of four-place logarithms, and one of random numbers. 

There is a five-page glossary of symbols. A detailed table of contents and 
a well-prepared index facilitate the location of items of interest. All but four 
of the chapters contain a list of references. 

Eleven of the chapters contain exercises for the student. They are most
frequently interspersed within the chapter. These exercises are well designed 
to test understandings of terms, phrases, principles, symbols, formulas, use 
of tables, etc. Only three chapters contain exercises dealing with real data. 
To the reviewer, it seems that in courses for practitioners abundant oppor- 
tunity should be given to deal with real data—fictitious data do not often 
give experiences that will be met with in research. There are no such oppor- 
tunities in Chapters 14, 15, 17, 18, for example, which deal with the types 
of situations most often encountered in practice or likely to be used by stu- 
dents in their own research. These types of situations are likely to be most 
[...] and to develop an appreciation of the dynamic qualities of statistics.

In summary, the reviewer believes that this text will make a valuable con- 
tribution by elevating the plane of instruction in applied statistics. While the 
fields of application are chiefly in education and psychology, the book could 
profitably be used as a basic text in other fields. It will particularly appeal 
to instructors in the social science fields because so many of the current texts
are written for students in biology and in agriculture. 


Statistical Methods in Experimentation: An Introduction. Oliver L. Lacey. New
York: The Macmillan Company, 1953. Pp. xi, 249. $4.50.


Oscar Kempthorne, Iowa State College


This text is designed for use in a one-semester course in general statistics
as applied to experimentation, and assumes one semester of college alge- 
bra as a background. The subject matter of the book is indicated by the 
chapter headings: The aim and problems of statistics in experimentation; 
Experimental design; Interpretation; Probability (I)—the probability of 
discrete events; Probability (II)—probability in a continuum; Three chapters
on the normal distribution; Tests of significance of means and differences
between means; Enumeration data; Correlation; Regression; Fiducial
limits; Experimental design; Appendix of tables.
The plan of each chapter is to give first a general discussion of the matter 
under consideration, then examples, which are in the form of questions, and


answers to these examples, and finally problems for the student. The general 
form of the book, the level of discussion of topics, and the question-and- 
answer technique strike this reviewer as being very good and very appropri- 
ate to a first course in statistics for students with no mathematical back- 
ground. The format and general style are very clear and should not be a 
source of confusion or annoyance to the intended readers.

A number of criticisms can, however, be made. 

With respect to the design of simple comparative experiments, the discussion
is somewhat naive. The author states that there are four general methods
of controlling variation due to extraneous factors: (1) elimination, (2) equalization,
(3) balancing, (4) randomization. The first two are never completely
realizable, and the third is only realizable with special assumptions which 
may be very arbitrary. The basic fact is that the first three methods are 
valuable adjuncts to randomization and it is rare that randomization can 
be dispensed with. In any case if the physical act of randomization is not 
done the validity of the statistical inference is questionable as Fisher origi- 
nally stated. The combination of system and randomness in a design is little 
discussed, and this is perhaps the most important feature experimentally. 

The discussion of the size of experiment is made generally in terms of
“minimal adequacy,” although the word efficiency is used occasionally. It 
would seem desirable pedagogically to conform to the usual use of the word 
“efficiency,” and to retain the notion of power instead of introducing a new 
notion. 

It would be desirable that all statistical texts make the distinction between
regression relationships and functional relationships. Originally regression
was a relationship in a multivariate population, but now is widely used for
the relationship of y to x, where x is not a random variable. When one uses


discussed more fully, and examples given.

The author acknowledges the stimulation of R. A. Fisher’s The Design of
Experiments, but certain passages indicate that there is some confusion in 
the author's mind on some basic questions. For example he gives as a sample 
question, “Does home economics training result in a better chance for 




is given is not an answer to the question posed. The question posed can be 
answered strictly by giving girls home economics training or arts college
training at random, without regard to their inclinations or anything else 
and then observing their marital success. The answer which would be ob- 
tained by the stated procedure is the answer to the question “Do home 
economics graduates have a better chance for marriage than liberal arts 
graduates?” The lack of appreciation of the difference between the question 
posed and the question answered appears to be rather widespread in the 
social sciences. Also there is a question about the answer: what is meant by 
“truly comparable?” The adjective “truly” appears to have taken on some 
mystic significance with some scientists and the use is better avoided. In 
addition the reviewer does not like the use of the word “experimentally” in
the answer, in that one is merely observing an uncontrolled situation. Nor 
does the reviewer favor the use of “identical” in statements such as “we 
should present them (two drinks), therefore, in identical glasses” p. 3 (cf. 
the use of “equal” on page 18). 
Writing really good elementary texts in statistics is a very difficult job. 
The present book is a good attempt, and would be more than good if the 
defects mentioned above were corrected. 


Methods of Statistical Analysis in Economics and Business. Edward E. Lewis. 
Boston: Houghton Mifflin Company, 1953. Pp. viii, 686. $5.50. 


Z. Szatrowski, University of Buffalo


In the opinion of the reviewer, Professor Lewis has written a definitely
good book on statistical methods in business and economics. He has done 
this by adding to the conventional material some of the fundamental con- 
cepts of modern statistical analysis. His discriminating selection of subject 
material and careful allocation of space according to the importance of basic
topics enable him to cover more and get it across. The exposition (text,
illustrations, and problems) is excellent.

The organization of the book is reasonably conventional and departures
from the pattern of the past are improvements. There are 686 pages, five
parts and an appendix. Part I (pages 1-96), the introduction, discusses the
purpose and problems of statistical methods. This section also includes an
explanation of graphs, tables, and “Statistical Numbers and the Problem of
Accuracy.” Part II (pages 97-186) presents descriptive statistics for frequency
distributions, i.e., measures of central tendency including the geometric
mean, measures of dispersion, skewness, and kurtosis. The application
of these descriptive statistics is explained clearly. The illustrations involving
comparisons are very effective. In connection with this section, the reviewer
would like to call attention to an inaccuracy in the specification of class
limits and mid-points. The author’s presentation results in a bias. To illustrate,
on page 67, Table 6, for weekly earnings recorded to two decimals,




the first class is given as 22.00-23.99 with mid-point 23.00. More accurately 
it should be 21.995-23.995 with mid-point 22.995. 
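
As a rough illustration of the point about class limits, the short Python sketch below computes the true limits and mid-point of a class from its stated limits. It assumes that the recorded earnings were rounded to the nearest cent, which is the assumption behind the reviewer's figures; the function name `true_class_limits` is hypothetical and does not come from the book under review.

```python
from decimal import Decimal


def true_class_limits(lower, upper, unit):
    """Return (true_lower, true_upper, mid_point) for a class whose stated
    limits are values rounded to the nearest `unit` (an assumption)."""
    half = unit / 2
    true_lower = lower - half   # smallest value that rounds up to `lower`
    true_upper = upper + half   # largest value that rounds down to `upper`
    mid_point = (true_lower + true_upper) / 2
    return true_lower, true_upper, mid_point


# The class quoted in the review: stated limits 22.00-23.99, recorded to cents.
lo, hi, mid = true_class_limits(Decimal("22.00"), Decimal("23.99"), Decimal("0.01"))
print(lo, hi, mid)
```

If the earnings had instead been truncated to two decimals, the true limits would be 22.00 and 24.00 and the mid-point 23.00, so the size of the correction depends on how the figures were recorded.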

Part III (pages 187-298) is entitled "Statistical Inference." Here are 
contained various tests of significance, for small as well as large samples, 
including the application of the F and χ² tests. This section concludes with
a discussion of quality control methods. In the reviewer's opinion, this is an
outstanding presentation. The problem of inference is given proper emphasis.
The various applications, which in so many texts are scattered throughout
the book, now appear in one section. The author succeeds in covering a great
deal in this part because he has a logical unified organization of the material
and because he develops each well-chosen illustration of inference just far
enough to present the idea clearly. 

Part IV (pages 299-370) very adequately discusses the subject of index 
numbers, their construction, the problems, and current applications. Part V
(pages 371-486) deals with time series analysis, trends, cycles, and seasonals,
including the changing seasonal. This section is not involved but thorough.
In connection with trend determination, the author includes a brief discussion
of the use of transformations, so widely applicable in non-linear trend
problems. Part VI (pages 487-649) is a clear presentation of correlation, 
multiple as well as simple, including tests of significance. 

The Appendix in this book contains a list of references, formulas, and 
tables of square roots and logarithms. Tables of the Normal, t, F, and χ²
distributions appear in the text where applications are discussed. 

The book contains an adequate number of problems which appear through- 
out the chapters to illustrate specific topics which the author has just discussed.
In general, the problems are well chosen and involve the minimum
of calculation. The format of the book is in keeping with the high quality of
the contents.

An instructor using this book can expect his students to get much out
of the text. Also he can expect many of his students to become interested in 
statistics. This book should enjoy success as a text. In addition, since it is 
easy to understand and relatively complete and up-to-date in its coverage, it 


[should] serve as a very useful reference book in business and economic statistics.


Sampling Inspection by Variables. A. H. Bowker and H. P. Goode. New York:
McGraw-Hill, 1952. Pp. xi, 216. $5.00.


H. C. Hamaker, Philips Research Laboratories, Eindhoven, Holland


The first thing one notices about a new book usually is its title, and it is
therefore important that this title should convey a concise and correct
impression of the content of the book it covers. The title of the book under
review does not quite satisfy this criterion and I am afraid that many statisticians
who ordered the book because of its attractive title will feel slightly




[...] does not bring a general and comprehensive survey of
“[sampling by] variables” and the variety of problems involved; the book
describes “a particular system of sampling by variables” to be used
when an attribute sampling plan according to the tables of Sampling Inspection
of the Statistical Research Group, Columbia University, or the Military
Standard 105A is to be replaced by a variables sampling plan.
The pattern of the sampling plans described is that given by Wallis
[in Chapter 1] of the Statistical Research Group's Techniques of Statistical
Analysis. The quality characteristic of the products inspected is supposed
[to be] measurable on a variables basis; an upper or lower limit, U
or L, is set, and we wish to control the percentage of items with a quality
[outside] of these limits. To this end we measure a sample of n items,
[compute] the average $\bar{x}$ and the standard deviation $s$ from these measurements,
and then require

$$\bar{x} + ks < U \quad\text{or}\quad \bar{x} - ks > L, \tag{1}$$

as the case may be, k being a constant to be chosen in relation to practical
requirements.
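
Criterion (1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample values and the constant k = 2.0 are invented, whereas in the plans under review k would be chosen from the book's tables, and the function names are hypothetical.

```python
from statistics import mean, stdev


def accept_upper(measurements, k, upper_limit):
    """Single-sample variables plan, upper limit U: accept if x_bar + k*s < U."""
    x_bar = mean(measurements)
    s = stdev(measurements)          # sample standard deviation, divisor n - 1
    return x_bar + k * s < upper_limit


def accept_lower(measurements, k, lower_limit):
    """Single-sample variables plan, lower limit L: accept if x_bar - k*s > L."""
    x_bar = mean(measurements)
    s = stdev(measurements)
    return x_bar - k * s > lower_limit


# Invented sample of n = 8 measurements and an invented constant k.
sample = [17.2, 16.8, 17.5, 16.9, 17.1, 17.4, 16.6, 17.3]
k = 2.0

print(accept_upper(sample, k, 20.0))   # limit far above the sample
print(accept_upper(sample, k, 17.5))   # limit close to the sample mean
```

The decision uses only the sample mean and standard deviation, which is what distinguishes a variables plan from an attributes plan that merely counts defective items.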

[This single] sampling technique is now supplemented with double sampling
[plans, which] are specified by three constants $k_a$, $k_r$, and $k_t$, and two sample
[sizes $n_1$ and] $n_2$. When after the first sample

$$\bar{x}_1 + k_a s_1 < U, \quad\text{the lot is accepted;} \tag{2}$$
$$\bar{x}_1 + k_r s_1 > U, \quad\text{the lot is rejected.}$$

[When neither of the relations] (2) is satisfied, that is when

$$\bar{x}_1 + k_r s_1 < U < \bar{x}_1 + k_a s_1, \tag{3}$$

[we proceed] to take a second sample, the requirement being that ultimately

$$\bar{x}_t + k_t s_t < U \quad\text{for acceptance,} \tag{4}$$

[where $\bar{x}_t$] and $s_t$ are average and standard deviation computed from the pooled
measurements of first and second sample. Sequential sampling according to
[this] principle would become complex and inconvenient in practice, and
[has not] been included for that reason.
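
The double sampling rule in (2) to (4) can be sketched as follows. The constants, the limit U, and both samples are invented for illustration; the names `k_a`, `k_r`, `k_t` and the function `double_sampling_upper` are assumptions of this sketch, and real plans would take the constants from the book's tables.

```python
from statistics import mean, stdev


def double_sampling_upper(first, draw_second, k_a, k_r, k_t, upper_limit):
    """Double-sampling variables plan for an upper limit U.

    `draw_second` is a callable returning the second sample; it is only
    called when the first sample is inconclusive.
    """
    x1, s1 = mean(first), stdev(first)
    if x1 + k_a * s1 < upper_limit:          # relation (2), acceptance
        return "accept"
    if x1 + k_r * s1 > upper_limit:          # relation (2), rejection
        return "reject"
    # Relation (3) holds: pool both samples and apply relation (4).
    pooled = list(first) + list(draw_second())
    xt, st = mean(pooled), stdev(pooled)
    return "accept" if xt + k_t * st < upper_limit else "reject"


first = [17.2, 16.8, 17.5, 16.9, 17.1, 17.4, 16.6, 17.3]
second = [17.0, 17.2, 16.9, 17.1, 17.0, 17.2, 16.8, 17.2]

# With U = 17.8 the first sample is inconclusive, so the second one is drawn.
decision = double_sampling_upper(first, lambda: second, 2.5, 1.5, 2.0, 17.8)
print(decision)
```

Passing the second sample as a callable keeps the sketch close to practice, where the extra measurements are taken only if the first sample falls in the indecision region (3).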

The above criteria hold if we assume the standard deviation σ of the lot
unknown and variable from lot to lot. If past experience has shown that σ
does not vary from lot to lot, known-sigma plans may be used by replacing
[s by σ] in inequalities (1) to (4) and modifying the constants k.
A separate chapter is devoted to cases where we have to consider
simultaneously an upper and a lower limit.

The choice of a sampling plan according to these principles is facilitated by
[an extensive] set of tables and charts, which have been arranged after the




model set by Sampling Inspection of the Statistical Research Group and its 
offspring the Military Standard 105A. 

After a choice has been made between three inspection levels, the plan is 
determined by lot-size and AQL-class. A full set of charts with operating 
characteristics of all sampling plans has been included, while additional 
tables are provided for finding the AOQL, or LTPD, and for deciding on 
tightened or reduced inspection. 

The text accompanying these tables gives full instructions as to their 
meaning and application. One chapter of 17 pages dealing with the theory 
requires considerable mathematical knowledge for its understanding, but 
all other chapters have been written in a simple and clear style, while mathematical
formulas have been largely avoided.

Yet despite the absence of formulas the text is mainly theoretical in 
concept, and many of the puzzles encountered in practice are not even men- 
tioned as the following examples may illustrate. 

The limits U or L, see above, are taken for granted, whereas it may often 
be an important preliminary step to decide by a statistical investigation 
whether they have been fixed in a reasonable manner. 

It is repeatedly stressed that a first step is to decide what is the item to be 
inspected. In most cases, however, this is obvious enough, but it is more diffi- 
cult to decide how the measurements are to be performed. For instance, in 
the case of metal sheets (pages 86-93) the thickness and Rockwell hardness 
may easily vary from the centre to the border of the sheets and if so it may 
be difficult to decide how and where the measurements have to be per- 
formed. Similarly special measures may sometimes be required to avoid 
systematic differences between different inspectors or apparatus. Such points 
have not been considered.

The sample size for known-sigma plans is about one-fourth that for unknown-sigma
plans. One of the most useful functions of a control chart used
in combination with sampling by variables would therefore be to indicate 
when sigma is sufficiently under control for the use of known-sigma plans; 
but chapter 9 provides no guide on this point. 

In practice, lot sizes may vary considerably and sample sizes with them. 
But a control chart for varying sample sizes loses a great deal of its at- 
tractive simplicity and, therefore, of its technical importance. Hence if we 
are to derive full profit of the control chart technique in combination with 
sampling by variables a constant sample size may be of tremendous advantage.
We should not forget the Lot-Plot technique, which has taught us
the practical value of a constant sample size, and a simplified technique for
computing the standard deviation.

On two points of methodology I disagree with the authors. 

The standard deviation of a sample is defined by 


$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}},$$




using n − 1 in the denominator instead of n as is more common. This, I think,
is a happy change; in more advanced applications of statistics the concept
of the number of degrees of freedom is so fundamental that it should be
introduced at an early stage, if we are not to be led into considerable confusion
and contradiction later on. But the standard error of s as defined
above is

$$\sigma_s = \sigma / \sqrt{2(n - 1)}$$

and not $\sigma / \sqrt{2n}$ as stated on page 107; the B factors for control
limits in table J should be corrected accordingly.
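
A small numerical check of this point, assuming normally distributed measurements: the exact standard error of s is σ·√(1 − c₄²), where c₄ = √(2/(n − 1))·Γ(n/2)/Γ((n − 1)/2) is the usual bias factor with E(s) = c₄σ. The sketch below compares this exact value with the two approximations discussed above; the function names are hypothetical and σ is set to 1.

```python
from math import exp, lgamma, sqrt


def c4(n):
    """Bias factor for s under normality: E(s) = c4(n) * sigma."""
    return sqrt(2.0 / (n - 1)) * exp(lgamma(n / 2.0) - lgamma((n - 1) / 2.0))


def exact_se_of_s(n, sigma=1.0):
    """Exact standard error of s (divisor n - 1) for a normal population."""
    return sigma * sqrt(1.0 - c4(n) ** 2)


for n in (5, 10, 30):
    exact = exact_se_of_s(n)
    approx_n_minus_1 = 1.0 / sqrt(2 * (n - 1))   # sigma / sqrt(2(n - 1))
    approx_n = 1.0 / sqrt(2 * n)                 # sigma / sqrt(2n)
    print(n, round(exact, 4), round(approx_n_minus_1, 4), round(approx_n, 4))
```

For the sample sizes tried here the exact value lies between the two approximations and is closer to σ/√(2(n − 1)), which is consistent with the correction the reviewer proposes.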

More serious objections can, I believe, be raised against the techniques
recommended on pages 62 and 63 for finding the process average. Let us
suppose that the critical limit U is 20, and that we receive 19 lots with
average μ = 17 and standard deviation σ = 1, and one lot with μ = 25, σ = 1;
19 lots are completely good, 1 is completely bad, the process average is 5%.
But if we pool these 20 lots into one grand lot, this grand lot will have an
average μ = 17.4 and a standard deviation σ = 2.0.

Should we consider this grand lot to be normally distributed, we would 
conclude that the process average per cent defective is 9.7%, a gross over- 
estimate. It might be objected that nobody would ever consent to treat 
such a heterogeneous mixture of lots as one lot with a normal distribution, 
but this is exactly what is recommended on the pages cited. Samples drawn 
from successive lots are pooled into one grand sample, and the mean and
standard deviation of this grand sample are treated as if they represent a 
grand lot with a normal distribution. This is only correct if the lots are under 
control, but then sampling inspection is not needed. Sampling inspection 
fundamentally applies where there is not sufficient control; in that case
within-lot and between-lot variation should be clearly distinguished, and
pooling of data from different lots is not permissible. The correct procedure
is to estimate the percentage defective for each lot separately and average 
these percentages. 
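
The reviewer's numerical example can be retraced with a short Python sketch. It assumes 20 lots of equal size, each normally distributed, and uses the standard library's `NormalDist`; the variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

U = 20.0
# 19 lots with mean 17 and one lot with mean 25, all with standard deviation 1.
lots = [(17.0, 1.0)] * 19 + [(25.0, 1.0)]

# Procedure the reviewer calls correct: fraction above U per lot, then average.
per_lot = [1.0 - NormalDist(mu, sigma).cdf(U) for mu, sigma in lots]
process_average = sum(per_lot) / len(per_lot)

# Procedure the reviewer criticizes: pool everything and treat it as one normal lot.
grand_mean = sum(mu for mu, _ in lots) / len(lots)
grand_var = sum(sigma ** 2 + (mu - grand_mean) ** 2 for mu, sigma in lots) / len(lots)
grand_sd = sqrt(grand_var)
pooled_estimate = 1.0 - NormalDist(grand_mean, grand_sd).cdf(U)

print(round(process_average, 4), round(grand_mean, 2), round(grand_sd, 2),
      round(pooled_estimate, 4))
```

The per-lot average comes out close to 5%, while the pooled normal approximation gives nearly 10%. Using the rounded standard deviation 2.0, as in the text, the pooled figure is about 9.7%.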

Summing up then, where Military Standard 105A or the Statistical Research
Group’s Sampling Inspection for attributes is to be replaced by sampling
by variables, the reader will find in this book an immediate guide;
while he who wishes to develop a system of his own will find in it a great
amount of information, which will be useful if combined with practical experience
and ingenuity.


An Introduction to Statistical Science in Agriculture. D. J. Finney. New York:
John Wiley and Sons, 1953. Pp. 179. $3.75.


H. W. Norton, University of Illinois 


This little book has chapters entitled “The need for statistics,” “Some
problems of rates and frequencies,” “Probability,” “Properties and [...]
of distributions,” “An experiment to compare two varieties,” “The reduction
of error,” “Factorial design,” “Sampling,” and “Correlation and regression.”




In addition, there is a preface, a valediction, references, and index. Its object
is “to impart a knowledge of the general principles on which statistical 
science is founded and of the manner in which it enters into so many agri- 
cultural problems . . . the emphasis throughout is on the principles illus- 
trated by the examples rather than on the arithmetic and algebra of 
calculation.” I am one of that group of statisticians (mentioned by Finney) 
who think the correlation coefficient is now of very trifling importance, es- 
pecially in an introductory book. This topic can hardly have much of the 
merit of familiarity attributed to it by the author; and the space devoted 
to it might better have been used to give a simple example of analysis of 
covariance.

A number of formulas are given and used arithmetically, but there is no 
algebra nor other mathematics with the sole exception of the derivation of
a formula for the sample number required to realize a specific confidence
interval. Type, paper, and binding are good and there are few typographical
errors.

It is a curious fact that there appears to be a real need for nonmathematical
books on statistics. That is, it is curious that there are many people who are 
established scientists, and many others offering to become scientists, who 
have so little mathematics (in spite of a great need thereof) as to warrant 
the preparation of severely nonmathematical books expounding an essen- 
tially mathematical subject. It is reassuring in a way, in the midst of current 
attacks on education in the public schools, accompanied sometimes by asser- 
tions that standards are higher elsewhere, to note that this book is based on 
(and presumably is the substance of) a “course of eight lectures . . . to 
undergraduates reading agriculture at Oxford.” I have long held the view 
that the research scientist's most urgent statistical need is for an appreciation
of the essential ideas of the design and analysis of experiments, and that
those ideas can be appreciated without recourse to mathematics. Therefore
I expect Finney’s book will probably have wide appeal, and it seems well 
suited to its purpose of conveying appreciation of principles. 

An important feature of the book is its stress on the relation between the 
research worker and the statistician, and the continual emphasis on the
desirability of expert statistical advice at all stages of the work but especially 
before experimentation is begun. This culminates in the “Valediction,” with 
its twelve rules of respectable statistical conduct for research workers. The 
first is “When you propose to undertake an experiment or a sampling investi- 
gation of a kind that presents any novelty to you, consult a statistician at 
an early stage of your planning.” This will serve as an example while enabling 
me to point out my only substantial criticism of these rules, that the phrase
“of a kind that presents any novelty to you” should have been omitted on
[the ground] that recognition of relevant novelty will often require consideration
by an experienced statistician. I was glad to find (p. 126) Finney's
observation, “even a year . . . is not an excessive margin, for consideration



of the relative merits of different designs takes a long time and is better done
in short spells over a period than in a single concentrated effort." How often 
the statistician has occasion to agree! 

There are a few statements to which exception must be taken, and which
do not appear to be merely the result of such simplification (usually called
oversimplification) as reasonably would be expected in such a book. The
assertion (p. 25) that Yates’ “correction is always ½” and (p. 33) “applies
only to 2×2 tables” may be defensible in a chapter which emphasizes
contingency tables, but is likely to prove confusing to the novice who looks
into more than this one book. The statement (p. 156) that the covariance
“must always be intermediate in magnitude between” the two variances is
wrong: it may not be larger in magnitude than the square root of the product
of the variances. The statement (p. 102) that “both the smaller dressing of
sulphate of ammonia and the ammonium humate gave yields significantly
higher than that without nitrogen” may prove disconcerting to the careful
reader, because the differences mentioned are exactly at the 5% point (following
Finney’s practices in rounding) for the comparison of yields in pounds
per plot, but do not quite reach the 5% point for tons per acre, and retention
of additional decimals leads to the conclusion that significance is reached
only by accumulation of rounding errors. Occasionally the choice of words
is inappropriate, as (p. 75) the use of “indubitable” to describe the superiority
of one variety to another after their difference of yield has proved significant.
Some statements would be improved by minor changes, such as (p. 60)
“conditions that are as alike as possible,” which is likely at once to coincide
with the readers’ prejudice and to miss an opportunity to emphasize the
potential importance of a proper distribution of effort in experimentation,
would better say “as practicable” or “as reasonably possible,” and (p. 68)
“the difference is greater than would occur by chance” needs to say “would
be at all likely,” so as to keep always before the novice the idea that any
difference may have occurred purely by chance. I could not verify the
probability near the top of p. 57, nor the fiducial limits on p. 159. There is
a misprint in table 6.1, where the mean for block V should be 19.0; on page
98, where the divisor in the formula for F should be 3.0; and on p. 168 the
references are to table 9.3.
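
The reviewer's remark about the covariance (p. 156) can be illustrated with a small made-up data set. In the Python sketch below the covariance is smaller than both variances, so it is not “intermediate,” yet it respects the bound given in the review, namely that its magnitude cannot exceed the square root of the product of the variances. The data and the helper `sample_cov` are invented for this sketch.

```python
from math import sqrt


def sample_cov(x, y):
    """Sample covariance with divisor n - 1; sample_cov(x, x) is the variance."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


# Invented paired observations.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 1.0, 4.0, 3.0]

var_x = sample_cov(x, x)
var_y = sample_cov(y, y)
cov_xy = sample_cov(x, y)
bound = sqrt(var_x * var_y)

print(var_x, var_y, cov_xy, bound)
```

Here both variances are about 1.67 and the covariance is 1.0, so the covariance lies below both variances while still satisfying |cov| ≤ √(var_x · var_y).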


Agricultural Prices. Frederick Lundy Thomsen and Richard Jay Foote. New
York: McGraw-Hill Book Company, 1952. Pp. xi, 509. $6.50.


L. J. Norton, University of Illinois


This is a thorough revision of a 1936 book by the senior author. It is a
text book intended for students principally in agricultural colleges. It
bears the imprint of extended experience in teaching, government economic




392 AMERICAN STATISTICAL ASSOCIATION JOURNAL, JUNE 1954


research, and practical experience in business. It is divided into three parts: 

1. A review of simple economic principles, discussion of some materials,
presentation of the elements of economic fluctuations, and an analysis of
government price programs under the over-all heading of “price determination
and discovery.”

2. Price analysis and forecasting. 

3. A series of commodity reviews, labeled “commodity prices.” The emphasis
is on simple, direct tools rather than on complex mathematical
methods.

Economic statisticians, particularly those engaged in commodity analysis, 
will find Section II of particular value. The chapters on commodity prices 
are not up to the standard of those in the first two sections in getting the 
problems into clear focus.

The authors hold a hard-headed point of view with reference to the value 
of overly elaborate analysis in forecasting. Their point of view might be 
summarized: Get all the information you can out of the best analysis you 
know how to make. Test your results with all the logic and common sense 
you have. Apply them with judgment and be on the alert for facts not used 
in your basic analysis. From his observations on forecasters and forecasts
this reviewer deems this to be a highly sensible point of view. The authors 
suggest that pin-point forecasts are not needed but rather “a degree of inter- 
pretation which will enable the user to anticipate the direction and to some 
degree the extent of the movement” (p. 351).

In discussing price determination both the aggregate approach and the 
older individual commodity approach are used. No time is spent on hair-
splitting differences or too-elaborate types of analysis. The students should
get the points. Perhaps the terms supply and demand are used too broadly 
and so lack precision, but this may be justified on the grounds of teachability. 
The discussion of the “relation between cash and future prices” is not up to
the standard of the book. Quite possibly many traders in the futures markets
have better bases for formulating expectations than this treatment would
lead one to believe. In some form or other many of them have grasped some
of the ideas which the authors later develop as to reasons for price trends. 
In discussing the general price level too much emphasis is put on banking 
and credit. Perhaps the authors do not infer causation but their treatment 
would leave such an impression in the reader’s mind. After a period when
monetary inflations, devaluations, and many other developments have af- 
fected the levels of prices in various parts of the world, their analysis seems 
oversimplified. This may be important because in connection with the
general price level the price analyst may find an important element which
has been omitted from his statistical calculations. In this area the
authors do not seem to have done work in depth as in other phases of price
analysis.

On the whole this is a good job and both students and workers in the field
of price analysis will find it valuable.


BOOK REVIEWS 393 
Textbook of Econometrics. Lawrence R. Klein. Evanston, Illinois, and White
Plains, New York: Row, Peterson and Company, 1953. Pp. ix, 355.

Kenneth J. Arrow, Stanford University


The appearance of this book marks an important stage in the development
of econometrics, since for the first time there is available a useful
textbook. To say that Klein's work is the best of its kind would be correct
but very inadequate, since he has virtually no competition. Some other


[...] own right, but they do not meet the need of a beginning course in econometrics
(as opposed to mathematical statistics) presupposing a reasonable


Econometricians are very fortunate indeed that this text, which will undoubtedly
be the standard one for many years, is so truly excellent. Klein’s
eminence as a practitioner of the econometric art is well known, but one 
would not necessarily expect such an extraordinary level of didactic skill 
from a mathematically-inclined economist (although the readers of The 
Keynesian Revolution might not be surprised). 
After an introductory chapter on the econometric approach, there is a 
y-page summary of the basic principles of mathematical statistics. While 
about as good a job has been done as is possible in the space,¹ I seriously
doubt the usefulness of the attempt. The material the author seeks to cover 
requires, I would judge, about one semester to master, and no attempt to
speed up the learning process is apt to be successful. I think it preferable to
require a basic course in mathematical statistics as a prerequisite.
Chapter III, entitled “Estimation of Aggregative Models,” is in many
respects the core of the book. A simplified macro-economic model is developed,
and the method of least squares is then introduced as the maximum-likelihood
method of estimating one of the equations. In more general cases, it is
noted that the problem of identification arises; the concept is rigorously
defined and its application to linear models stated. The estimation of an
aggregative model by the method of maximum likelihood applied to the
whole system is then considered, with attention paid to the special cases
where the statistical method has a simple form. The method of instrumental
variables and the limited information method are next introduced, though
the explicit derivation of the formulas from maximum likelihood considerations
is omitted. The chapter concludes with a discussion of confidence intervals
for the estimated parameters.
The next chapter is a detailed description of the computational designs of
the least squares, maximum likelihood full-information (under the assumption
of a diagonal covariance matrix of the disturbances), and maximum
likelihood limited information methods. The exposition is first-rate; for a
student, it could hardly be improved upon.
1 Some minor errors in this chapter have been noted in the review by R. Solow, American Economic
Review, Volume 43 (1953), 947-50.




the measurement of income or wealth, with the meaning of the concepts 
measured and with the interpretation or analysis of the measures. Most 
economists and statisticians engaged in studies of income and wealth are 
concerned with basic problems of measurement, the formulation of concepts 
and definitions, the accumulation of data from diverse sources ranging from 
administrative statistics to family living surveys and the adaptation of 
these data to the conceptual framework of a particular system of accounts.
The greater number of entries in Volume II, as in Volume I, present national 
estimates or survey data on income, wealth, and related topics for various 
dates.

In general, the estimates of national aggregates seem to be conditioned in
all countries by the same limitations imposed by the nature of the source 
material, and the supply of data, and by the financing of the necessary re- 
search. The first of these conditions is apparently leading towards some uni- 
formity in the details of the national accounts for various countries under the 
influence of the various recommendations of the United Nations' Statistical 
Commission. The second explains the variation in the volume of estimates 
among countries in possession of the source data and the technical skill 
required for their utilization in the construction of systems of national ac- 
counts. 

The discussions of concept in the writings listed in Volume II display, on
the other hand, no clear tendencies towards the resolution of the controversies
that have filled the literature on national accounts for the past two decades
or more. From the viewpoint of a statistician with a bias towards empirical
methods the controversial issues can not be settled until they are
submitted to a thorough logical analysis. The operational nature of the basic 
concepts needs clarification; the limitations on measurement should be recognized;
and the distinctions between the concept specified for measurement
and the theoretical construct of the same name should be sharply drawn. 
Until the logic of this field of measurement has been considerably refined, 
the contribution of statisticians to the improvement of the quality of the 
primary estimates will be seriously restricted. A few entries in Volume II
discuss the potential usefulness of sample surveys for the collection of data 
needed for the purposes of national accounts and the need for developing 
measures of the error in estimates. The statistician’s skills can not, however, 
be effectively employed until the basic concepts are made to correspond 
more simply with magnitudes that can be directly observed than is now the 
case.

Basically the concepts specified in national accounts are synonymous with 
the operations for determining the particular measurements. National income 
analysts have adopted the tools of the accountant and are developing rules 
for their application to data from diverse sources relating to the economic
activities of the productive units in the particular country. The productive 
units fall into three classes, business enterprises, households, and governments.
The economic activities of the three classes of units are not distinct




and they overlap in different ways and in varying degrees among the nations
now attempting to accumulate national income estimates and other social
accounts. The procedures of accounting apply directly to one class of units,
business enterprises, and such concepts as wealth, income, production, consumption,
and capital formation can be differentiated in the records of the
economic activities of these units. More or less arbitrary but fairly standardized
rules have been devised for fitting such activities of other classes, households
and governments, into the same structure of measurements as overlap
with or substitute for the activities of business enterprises. Thus the food
a farm family consumes that otherwise might be sold by the farm business;
the home occupied that might otherwise be rented by the owner as a landlord;
the goods and services provided by a public authority that elsewhere
are produced by business enterprises can be described by essentially the
same structure of accounting concepts.
Difficulties arise when operational definitions are extended well beyond
the domain of experience with their application. The Bibliographies indicate
clearly that the scholars engaged in the measurement of national accounts
are, as of 1949, still unwilling to cope with the fundamental limitations on
the extrapolation of concepts beyond the region of their applicability. The
blurring of concepts outside of the original area of definition which occurs in
all fields of measurement is particularly well illustrated by the problems encountered
in describing the economic activities of “subsistence” units for the
purposes of national accounts. The differentiation between production and
consumption, saving, wealth, income, and similar concepts cannot be simply
established by adopting the operational definitions that serve for the aggregation
of the accounts of business enterprises. Volume II of the Bibliography,
like Volume I, lists very few references to any attempts at measurement of
the activities of subsistence units. The small number of entries does not indicate,
however, that the analysts have rejected this domain as not susceptible
to measurement in terms of prevailing concepts. There are as many references
which list the failure to measure production for home consumption as
an element affecting the accuracy of comparisons of national income estimates
among countries and between dates. The absence of “data,” not a consideration
of the validity of concepts in this domain, probably explains the
small amount of research on this subject.


[...] area of activities overlapping with business enterprises, present the difficulties
of extrapolation to another scale. Government accounts and balance
sheets are not easily summarized to yield the concepts specified for aggregated
accounts of a whole economy. There are a fairly large number of direct
references, some twenty papers dealing explicitly with the introduction of
government activities into national accounts, and nearly half of the general
papers on concept treat the problem. There are doubtless many ways of explaining
the apparent concentration of analysts in the years 1948-49 on the
problems involved in fitting government activities into the structure of




national accounts and answering Kuznets' questions in the preface to the 
Bibliography, Volume I, “Why do ‘official’ estimators in industrially devel- 
oped countries in recent years adhere so closely to the treatment of all gov- 
ernment purchases as final product, whereas unofficial estimators do not 
easily accept the underlying assumption? Does the former treatment permit 
greater ease in planning and budgeting governmental activities and tracing 
their impact on the short-term economic situation of the country?” 

Data in the form of government accounts do exist in most countries and 
it is not surprising to find the empirical workers concentrating on the problems
of extension and adjustment of concepts to this sector of economic activities.
The answers to Kuznets’ two questions will be given by experience with
statistical analysis of the national estimates that have been accumulating in 
volume sufficient to warrant this Bibliography to cover the output of two 
years.

The estimates of national product, national income, and other forms of 
social accounts have become exceedingly important in national policy, plan- 
ning and forecasting. The tools of the statistician may not, for some time to 
come, be of much use in the primary processes of data collation and estima- 
tion but in the processes of analysis the statistician has much to offer. The 
real refinements of the concepts in national accounts can be expected to come 
as experience with statistical analysis of the existing estimates, however in- 
complete, inaccurate, and inconsistent they may be, reveals the structure of 
definitions that is practically most useful for the purposes the estimates are 
intended to serve.

The relative virtues and values of the sundry concepts for prognosis, fore- 
casting and general economic studies will probably be decided by the results 
of statistical analysis long before the scholars can agree on the identification
of the concepts specifying the measurements with theoretical abstractions. 
The accumulation of observations and methods of empirical analysis may 
soon render some of the controversial issues obsolete, even meaningless. 
The stress, for example, on national income or some other aggregate as an 
absolute measure of “welfare,” a really vague abstraction, in the formulation
of concepts will doubtless disappear as experience with analysis is extended 
in time and over the nations of the world. Analysis to date, it is true, can
produce little that merits such optimism about the progress of empirical 
knowledge of the numerous economic magnitudes that fill in the national 
accounts. In particular, the much debated “consumption-saving function,” 
which accounts for at least 10 papers listed in the Bibliography, Volume II,
[...] have produced any promising new tools of analysis.
[...] in national wealth and social accounts has already stimulated
[...] series analysis and surely will lead to some new developments
[...] in economic analysis of observations on geographical [...]
than 70 [...] the Bibliography for 1948-49 cover [...]
[...] colonies for varying dates or ranges of dates. Surely this




difficulties imposed by variability in concept and in the precision of estimates
and devise the means of analysis. Empirical science thrives on the continuous
revision of theoretical concepts imposed by experimental findings and the
steady refinement of methods of measurement and analysis that result from
these revisions.


The Role of Federal Credit Aids in Residential Construction. Leo Grebler. New
York: National Bureau of Economic Research, Inc., 1953. Occasional Paper No.
Pp. 76. $1.00. Paper.


Sherman J. Maisel, University of California (Berkeley)


This paper is a concise review of the direct impact of the Federal government
on residential financing. Data have been gathered from a wide
variety of sources. They have been well organized so as to give to the reader
a clear picture of the extent of Federal intervention in the real estate financing
field. By judicious selection and analysis, the author clearly illuminates
the possible results of these governmental policies. While the analysis is clear,
lack of data has made it impossible to appraise fully most of the hypotheses
concerning effects that are presented. This lack of data, together with the
amount of piecing together of disparate statistics required for this
study, is a clear indication of the failure of the Federal housing agencies to
collect and publish adequate statistics required for an accurate evaluation
of their programs and policies.
The paper points out that under the programs of the Federal Housing
Administration and Veterans Administration, the government in recent years
has insured or guaranteed between 40 and 50 per cent of loans made on new
housing. As a result of these governmental policies, the level of house construction
was probably somewhat higher than it would otherwise have been;
the market for houses was almost certainly widened; and the costs of ownership
may have been lowered. Although primarily as by-product results of
actions taken in response to other needs, there have been important impacts
on the type of houses built and on institutional practices in the lending
markets.
Many readers will find that the value of this review of policy is increased
by a supplemental statement of the author as to the conclusions concerning
the future that he draws from the past trends. By adding such a statement,
the author obviously opens himself to attack from those who disagree with
his reading of history. This reviewer feels, however, that the clear statement
of future policy problems which may arise from the action taken to date greatly
enhances the value of this paper. Providing, as in this case, sufficient detail
is given to make disagreement possible, a consideration of future problems
by an author completing a detailed analysis of the past is a valuable addition
to the study, whether or not one agrees with the conclusions.




Retail Prices and the Consumer Preference. Studies in Business and Economics, 
Volume 6, Number 1. University of Maryland, College Park, Md.: Bureau of 
Business and Economic Research, 1952. Pp. 8. 


SornrA Gogex, Standard Oil Company (N. J.) 


This pamphlet contains three brief papers: “Price Variations in Men's

Clothing by Kind and Location of Store,” “Retail Prices, Income and 
Race,” and “Consumer Patronage of Independents and Chains in 59 Cities,” 
each consisting primarily of a summary of the data analyzed. The first compares
the prices at neighborhood and centrally located specialty and department
stores within and among eighteen cities of eleven items of men’s clothing.
The second investigates the relative prices in one negro and three white
(low, medium, and high rental) neighborhoods in Chicago of a selection of 
dry goods, men’s clothing, and women’s clothing. The third summarizes 
consumer preferences, in shopping for food and for drugs, as between inde- 
pendent and chain stores, and as between neighborhood and centrally located 
outlets.

Analyses of this type would be both more interesting and more useful if 
there were more discussion of statistical problems and of the economic and 
cultural factors responsible for the behavior of the data as recorded. (The 
second paper, which does go into these matters to some extent, is much the 
best of the three.) A listing of the sources of data, where they are apparently 
secondary, would also be helpful.


Industrial Specifications. E. H. MacNiece. New York: John Wiley and Sons,
1953. Pp. xiii, 158. $4.50.


J. H. Curtiss, New York University


This little book gives a description of several different types of written in-

dustrial specifications now in use. The classes discussed are raw-material 
specifications, process specifications, product specifications, and purchase 
specifications promulgated by public agencies. The exposition is illustrated 
rather extensively by examples. x 

The treatment is generally descriptive rather than analytical, and accepts 
current practices without criticism, constructive or otherwise. The sugges- 
tions as to the techniques of preparing specifications are largely confined to 
listing the topics which should be covered and to urging that clear language
be employed.

This serves a useful purpose, and the book should be an influence toward 
better specifications. On the other hand, the deeper problems of specification-
writing are dealt with only briefly and qualitatively. The problems which the
reviewer has in mind are exemplified by these: Should acceptance sampling requirements
be included in the written specification along with design requirements
or should they be issued separately? Should design requirements ever
be stated in terms of the behavior of random samples? (In several of MacNiece’s
examples, this does occur.) How should laboratory research data be




processed so as to derive valid tolerance limits? Are inspection data better 
than laboratory data for this purpose? And so forth. MacNiece handles prob- 
lems of this sort largely by generalities, such as “Quality control and indus- 
trial specifications are blood brothers, and each depends on the other." 

But the choice of subject matter is always an author’s prerogative. The
fact that this little volume is non-technical makes it easy to read, and Mac- 
Niece writes fluently and well. Quite apart from its intended use, the book 
should provide some useful case-history material for teachers of quality 
control and industrial management. 


Analysis in Dental Research. Neal W. Chilton. Washington, D. C.: Office of Technical
Services, U. S. Department of Commerce, 1953. Pp. 216.


Geoffrey Beall, University of Connecticut


This introductory text on statistics is devoted entirely to data obtained in
dental studies. One’s first impression is to regret that not only must we
accept the modern departmentalization of knowledge but must suppose
modern man incapable of transferring mental operations from one field of
fact to another. On second thought, the dental student, research worker, and
clinician will indeed enjoy a book where statistics is so richly illustrated in
their terms. The book bears the marks of having been tried and proven in 
class; the discussion is pleasant and reasonable. 

The general treatment shows the mark of being unduly influenced by the 
extensive and varied literature of texts introducing statistics. Thus, in connection
with the calculation of a standard deviation from a sample, the
author follows the very old and curious procedure of recommending a biased
estimate, having divisor N, when N is greater than 30. On the other hand,
in connection with concepts of chance variation and the binomial distribution
he is influenced by the vogue of the moment which would circumscribe
statistics to an elaboration of combinatorial probability. This confusion
of position does not help general organization. Thus correlation is presented
in an infelicitous manner involving the estimation of standard deviations
and means. It should, in the present context, be reduced to a form involving
only sums of various kinds. An unhappy mixture occurs in the discussion
on frequency graphs together with the general nature of numbers and
the presentation of time series.
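
The point about the divisor can be illustrated with a minimal sketch. The sample below is invented for the purpose and is not taken from the book under review; the function name is likewise hypothetical. Dividing the sum of squares by N rather than N - 1 shrinks the estimate by the factor sqrt((N - 1)/N), which is small but not zero at N = 31.

```python
import math

def std_dev(values, divisor_offset):
    """Standard deviation using divisor N - divisor_offset.

    divisor_offset=0 gives the divisor-N form; divisor_offset=1 gives the
    divisor N - 1 form (whose square is the unbiased variance estimate).
    """
    n = len(values)
    mean = sum(values) / n
    sum_sq = sum((v - mean) ** 2 for v in values)
    return math.sqrt(sum_sq / (n - divisor_offset))

# Hypothetical sample of 31 observations (N > 30), made up for this sketch.
sample = [float(i % 7) for i in range(31)]

sd_divisor_n = std_dev(sample, 0)
sd_divisor_n_minus_1 = std_dev(sample, 1)

# The two estimates differ by the fixed factor sqrt((N - 1) / N).
ratio = sd_divisor_n / sd_divisor_n_minus_1
```
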

Various particular faults may be found. Thus the discussion of the mode
is confusing since the term is not used in the ordinary statistical sense but
here means the most common class. The author gives a lengthy and involved
discussion of “normal curve analysis of the four-fold table” without apparently
realizing that he has calculated a value of chi (not squared). He then
goes on to presentation of the contingency for the 2×2 table to get chi-squared.
In neither case does he include Yates’ correction for discontinuity.
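
The relation between the two calculations can be sketched as follows, under stated assumptions: the four cell counts are invented for illustration, the variable names are hypothetical, and the formulas are the standard textbook ones, not necessarily the form used in the book under review. The normal deviate for the difference of two proportions (with pooled variance) is chi, its square is the usual chi-squared for the 2×2 table, and Yates' correction reduces |ad - bc| by N/2 before squaring.

```python
import math

# Hypothetical four-fold table (rows: groups, columns: outcome present/absent).
a, b = 12, 8
c, d = 5, 15

n = a + b + c + d
row1, row2 = a + b, c + d
col1, col2 = a + c, b + d

# Normal-curve analysis: difference of proportions over its pooled standard error.
p1, p2 = a / row1, c / row2
p_pooled = col1 / n
z = (p1 - p2) / math.sqrt(p_pooled * (1 - p_pooled) * (1 / row1 + 1 / row2))

# Contingency analysis: chi-squared for the same table (1 degree of freedom).
chi_sq = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)

# Yates' correction for discontinuity, as commonly stated.
chi_sq_yates = n * (abs(a * d - b * c) - n / 2) ** 2 / (row1 * row2 * col1 * col2)
```

Here `z` squared agrees with `chi_sq` up to rounding, which is the sense in which the first calculation yields chi rather than chi-squared; the corrected value is smaller than the uncorrected one.
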

In spite of faults the book still has practical virtues. It concerns itself first

with the analysis of qualitative data and then with quantitative data as a less 




tribution but considers tests of various kinds on a simple normal basis,
though he devotes a terminal chapter to a very satisfactory discussion of the
t-test and analysis of variance.


Intrastate Migration in Michigan: 1935-1940. Amos H. Hawley. Michigan Governmental
Studies No. 25. Ann Arbor: University of Michigan Press, 1953. Pp.
199. $1.50. Paper.

John Folger, Southern Regional Education Board


This is a book of facts about migrants in Michigan. As an example of pure
empiricism it is outstanding, description hardly ever giving way to
analysis or interpretation. A good index of the descriptive nature of the work
is the fact that it contains 106 tables, 28 figures, and only one reference to
any other study in the field of migration.


[...] tremendous amount of detailed data contained in the subregional tabulations
of migration to understandable proportions. But the study never goes beyond
the initial framework provided in the Census data. No interpretation is given
either by comparison with migration in other times or places or in terms of
conditions or changes in the subregion of origin or destination. The student
of population seeking new theories of migration or new tests of old hypotheses
will be disappointed. The facts are there, and all of the pleasure and
work of assessing their meaning is left to the reader.

The measures of migration used are all easily understood, proportions
and simple rates being employed throughout most of the book. Greater use of
more refined rates and indirect standardization techniques would have been
valuable in removing the effects of differences in characteristics between migrants
and nonmigrants when studying the differences in other characteristics.
For example, how much of the difference between migrants and
nonmigrants in marital status is attributable to age differences between [...]


It is this reviewer's hope that future studies of migration utilizing the
detail of the subregional tabulations will be able to perform both the competent
summarization of the data that Hawley has provided, and a systematic
testing of hypotheses about migration and the characteristics of migrants.


Sample Surveys of Current Interest. (Fourth Report.) Statistical Papers, Series
C, No. 5. New York: Department of Economic Affairs, Statistical Office of the
United Nations, March 1952. Pp. 56. Paper. 50 cents.

Harold Nisselson, Bureau of the Census

This Report is the fourth and latest in a more or less annual series
[...] in 1948 at the suggestion of the Sub-Commission on Statistical Sampling




of the United Nations Statistical Commission, “to assist those interested in 
the application of modern sampling techniques by making available, in sum- 
mary form, the experiences gained by various statisticians and statistical 
organizations in the field.” Altogether, 33 of the 60 U.N. countries are repre- 
sented in one way or another in the four Reports published to date. Perhaps 
not unexpectedly, there have been no reports from Russia nor from any of 
the “Iron Curtain” countries except Czechoslovakia (twice), Hungary 
(twice) and Poland (once). The projects reported are presumably not to be 
taken as a proper sample of all sample surveys being carried out in the 60 
U.N. countries. However, the general impression left by the Reports is that 
the ideas and techniques of modern probability sampling have achieved a 
rapid and wide circulation, both geographically and in subject matter dealt 
with. On the other hand, hardly any reference is made to non-sampling errors 
in surveys. It is clear that elsewhere, as in the U.S.A., the general theory of 
survey design—in contrast to its sampling phases—is in a relatively unde- 
veloped state. 

The information for these Reports is obtained by circularizing the national 
statistical offices of all U.N. countries (in the U.S., the Division of Statistical 
Standards, Bureau of the Budget, Executive Office of the President), with 
dependence on the cooperation of those offices for replies. The reports, therefore,
have three characteristics. First, they appear to include only work of
official and quasi-official (e.g., The Indian Statistical Institute) organiza- 
tions. Second, not all current projects are reported, as is explicitly noted for 
India, the U.K., and the U.S.A., and the selections may be considered some- 
what arbitrary. Third, the project descriptions do not completely adhere to 
the recommendations of the Sub-Commission itself.¹ Information such as
variances of sampling units or unit costs is scanty; and the summaries tend
to be long on description and short on evaluation. (The U.S.A. is not outstanding
in these respects.) Statisticians interested in technical details of
the design of a given project, or in a realistic evaluation of the experience in 
carrying the project out, will therefore generally need to write to the United 
Nations Statistical Office or the sponsoring organization for further information.
Nevertheless, the Reports do provide a compilation in a single place of
some of the more important official sampling projects being carried on in
various countries, many of which have not been published as papers on
sampling. For example, the present Report contains the only published statement
currently available on the applications of sampling in the enumeration
and processing of the 1950 Censuses of Population and Housing. For this
purpose the Reports are more comprehensive, and should be more useful,
than specialized listings presently appearing in scattered journals. It presumably
lies with statisticians themselves to decide how useful this series
is and, by their cooperation, to remedy any shortcomings in the Reports.


1 “The Preparation of Sampling Survey Reports,” Statistical Papers, Series C, No. 1, Statistical
Office of the United Nations.




Age and Achievement. Harvey C. Lehman. Princeton, N. J.: Princeton Univer- 
sity Press, 1953. Pp. xiii, 359. $7.50. 


Leona E. Tyler, University of Oregon


Too often in psychology research workers have made generalizations that
reach far beyond the evidence their data present. This book is a striking
exception. Its 331 pages with their 170 graphs and 61 tables are all related in 
some way to the important question, “What are man’s most creative years?” 

What the author has done is to analyze facts reported in standard reference 
books with regard to the ages at which great men have made their outstand- 
ing creative contributions. Science, philosophy, music, art, and literature
have all come under his scrutiny. The result is a large number of curves, each 
of which shows the average number of important contributions for each five- 
year age period in the area of human achievement to which it applies. All of 
these curves are placed on the same scale by the simple device of reducing the 
figures to percentages. Whatever its absolute value, the highest number, 
representing the peak age interval, becomes 100 per cent and the others 
take lesser values accordingly. Other statistical information is given in ex- 
tensive tables so that the reader can quite easily locate numbers of cases, 
medians, means, and standard deviations for the various distributions. The 
discussion emphasizes the modal values, the peaks of the curves, more than 
anything else, but the other figures are not ignored. The derivation of basic 
data from biographical dictionaries insures freedom from bias since such 
figures were compiled by men who were not concerned at all with this par- 
ticular question. 
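
The rescaling described above, in which the peak interval is set to 100 per cent and the other intervals take proportionally smaller values, can be sketched in a few lines of Python. The function name `scale_to_peak` and the counts below are hypothetical illustrations and are not figures from Lehman's tables.

```python
def scale_to_peak(counts):
    """Express each interval's value as a percentage of the largest value."""
    peak = max(counts.values())
    if peak <= 0:
        raise ValueError("the peak value must be positive")
    return {interval: 100.0 * value / peak for interval, value in counts.items()}


# Hypothetical numbers of contributions per five-year age interval.
example_counts = {"25-29": 12, "30-34": 24, "35-39": 18, "40-44": 6}
scaled = scale_to_peak(example_counts)
for interval, percent in scaled.items():
    print(f"{interval}: {percent:.0f} per cent")
```

Because every curve is divided by its own peak, curves built from very different absolute numbers of contributions can be compared for shape, though not for volume.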

In addition to his discussions of the central problem, the author includes
some chapters on supplementary issues, such as the relationship between
early achievement and total output, and comparisons of age trends in earlier
and later periods of history.

Several conclusions are drawn from the data presented. First of all, the 
vast majority of the curves show a remarkable similarity. The peak occurs 
early, most commonly in the 30's, but as low as the 26-30 interval for chemistry
and as high as the 40-44 interval for light opera and musical comedy,
for “best books," and for metaphysics. All curves show a gradual decline, 
with some achievement indicated even into advanced ages. It is perhaps note- 
worthy that the data for sports and for such activities as chess show the same 
trends as the others. Championships are won most frequently by persons 
under 36. 

Secondly, when the works agreed by most critics to be of the very highest
quality are singled out for special attention, the curves drawn from these
figures show earlier peaks and a more rapid descent than do those for the
less selective distributions. This finding is stressed again and again. The more
quality as distinguished from quantity is considered, the younger is the average
age at which it is achieved.

Thirdly, data for incomes and for positions of leadership produce curves
which are quite different from those for creative achievement. Their peaks
occur most commonly at ages of 50 and higher.

The author does not present any one type of explanation for these findings. 
He reminds us periodically that these are not biological curves representing 
the rise and decline of vigor or intelligence in any one individual, although 
if there were such developmental trends they would reflect themselves in 
this way. The extensive documentation in Chapters 13 and 14 with regard 
to outstanding contributions made at very early and at very advanced ages, 
as well as the evidence that trends have changed somewhat from one historical
period to another, argue against a purely biological explanation. There
are a number of other possible causes centering around motivation and
around the circumstances in which great men typically find themselves at
different periods of life. These are listed in detail in the last chapter.


It is possible to criticize Lehman’s report on two counts, one rather super- 
ficial, the other less so. First, it would have made better reading if the author 
had worked over and reorganized his material instead of simply combining 
into a book chapters that had already been published as journal papers. 
There is a great deal of repetition from chapter to chapter, as things stand, 
and the same points are made again and again. Second, the statistical work 
is confined entirely to descriptive statistics. Our judgments must be based on 
a scrutiny of the figures and tables, without benefit of significance tests of 
any kind. So far as the main question is concerned, this is probably satisfactory,
since the position of the “peaks” is easily determined by inspection.
But other questions would seem to call for tests of the statistical significance 
of differences. How similar are various curves in shape? Is there really a 
tendency toward bimodality as many of the figures in the first few chapters 
would suggest? Is the change from an earlier to a later period, as discussed 
in Chapter 18, significant? It is unfortunate that Lehman’s interest in the 
shape of the age curve has in some instances led him to give us data to which 
significance tests cannot very well be applied. For instance, in order to show 
peaks clearly, five-year intervals were selected in different ways for different 
bodies of data. Thus for German lyrics and ballads, the intervals are 17-21,
22-26, 27-31, etc. instead of the 20-24, 25-29, 30-34, etc. which are used in
most of the tabulations. Distributions set up in this way with intervals 
chosen after inspection would not be statistically comparable. Another in- 
stance occurs in Chapter 18 where the conclusions about historical periods 
would be strengthened by significance tests. What Lehman has done in each 
of his groups is to compare curves for the 50 per cent born earlier and the 50 
per cent born later. The result is that the actual historical period represented
varies from one comparison to the next. For example, a geologist born in 
1800 is included in the “early” group, whereas a mathematician born in the 
same year appears in the "later" group. Thus different groups of scientists 
could not very well be combined. 

These criticisms, it is true, apply to only a fraction of the data presented.
Other workers wishing to test various hypotheses about age trends in achievement
will find this book a mine of information. Much use can probably be
made of means and standard deviations as given in the various tables. While 
we can wish that the author had gone further in using this large body of data 
to enlighten us as to why these trends appear, we can at the same time be 
grateful to him for digging out the figures which show the trends themselves
so plainly. 


Backgrounds of Human Fertility in Puerto Rico: A Sociological Survey. Paul K.
Hatt. Princeton: Princeton University Press, 1952. Pp. xxiv, 512. $5.00. Paper, 


Henry S. Shryock, Jr., Bureau of the Census


This book is the first and most general report of a field survey conducted
in 1947 under the auspices of the Office of Population Research, Princeton
University, and the Social Science Research Center, The University of
Puerto Rico. The study was designed to “throw light upon those basic attitude
patterns and life conditions which affect fertility levels in Puerto Rico.”
Its conceptual design has much in common with the 1941 Indianapolis 
Study of Social and Psychological Factors Affecting Fertility. 

The probability sample used by the Insular Bureau of Labor Statistics in 
its monthly survey of the labor force provided a prelisting from which the 
present sample was drawn. Of all listed households, 92 per cent were located 
and at least one adult interviewed. Within these households 92 per cent of 
the adults were interviewed so that about 85 per cent of the adults covered 
by the original list were probably interviewed. The actual sample was 
checked for representativeness against the 1940 Census and by comparing 
households missed with neighboring households. There were moderate biases 
in the direction of the exclusion of youths and the economic extremes. 
(Data from the 1950 Census, which have subsequently become available,
indicate somewhat less bias in the sample with respect to the sex and age of 
adults.) 
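
The figure of about 85 per cent follows from multiplying the two stage-wise response rates, on the assumption (implicit in the reviewer's “probably”) that adults are spread across located and unlocated households in similar proportions. A small Python check, with variable names chosen here purely for illustration:

```python
household_rate = 0.92  # share of listed households located, with at least one adult interviewed
adult_rate = 0.92      # share of adults interviewed within those households

# Overall coverage of listed adults is the product of the two stage-wise rates.
overall_rate = household_rate * adult_rate
print(f"{overall_rate:.1%}")
```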

The supervisors of the field work were social science graduates. The 65 experienced
interviewers, all of them women, were given two days’ training
plus a conference after two practice interviews. The schedule was pretested
upon Puerto Rican women living in New York. The final forms, as well as
the instructions to the enumerators, are reproduced in appendices, although
only in the original Spanish. This rather long and intensive schedule consisted
of several parts: form 1 applying to each household, form 2 to each
person 15 years old and over, and forms 3 and 4 to each person who had had
any kind of marital experience (legal, consensual, or concubinal). On the 
other hand, the 1950 Census question on children ever born was asked of all 
women, including the single; and the over-all nonresponse rate was quite low. 

After an initial chapter on “Methodology,” the next four chapters are
entitled “The General Patterns of Social Conditions and Social Attitudes,”
“Socio-economic Status: Rental Value and Educational Level,” “Rural-Urban:
Birth and Residence,” and “Age Differences,” respectively. These
chapters treat the relationships among various background factors but do
not deal with measures of actual fertility. The relationships are examined by
tetrachoric correlation, with partial and multiple as well as zero order coefficients
being presented. In many instances there would seem to have been
enough cases to permit further cross-classification, which would have been
more satisfactory than holding a factor “constant” through partial correlation.

The factors brought into this correlation analysis include attitudes toward
a number of social questions, including, among others, the desirability of consensual
marriage for both men and women, ideal age at marriage, and ideal
size of completed family. The attitude measures themselves are examined in
several interesting ways (for example, ideal age at marriage is compared with
actual age at marriage and number of children desired for daughter is
compared with the woman's own actual fertility). The attitude questions and
the construction of the measures based on them seem relatively primitive
after the reports of the Research Branch, Information and Education Division,
War Department, although it must be allowed that little had been published
by Stouffer and his associates on their methodology by the time the
present study was designed. (The reviewer suspects that the rather disappointing
findings of the Indianapolis Study concerning the effects of psychological
factors on fertility would be replaced by much more positive results
could this new apparatus of attitude research be employed.)

Tetrachoric correlation is also employed extensively in the chapter on
“Differential Fertility,” which contains much of the meat of the study. It
is unfortunate that, for mechanical considerations of processing the data,
fertility is expressed in terms of pregnancies rather than of live births, since
the latter were probably reported more accurately. Robert Osborn, Jr., has
contributed a long chapter on “The Trend in Fertility,” using mostly cohort
analysis of the survey data but also pertinent census data and vital statistics.
The downward trend that he is trying to measure seems to be a recent and


үе answers, however. 

In conclusion, this book represents a sound job by the late Professor Hatt
and his associates. Their findings are interesting and significant. Insular
attitudes toward such things as family limitation, ideal size of family, ideal
age at marriage, education wanted for children, and the employment of


changing attitudes can be translated by the oncoming generation into
actual conduct, there would be a very substantial drop in the high fertility
levels now prevalent. From the standpoint of statistical methodology, this 
study contains few innovations; but the report does demonstrate that a 
fairly complicated kind of sample survey can be successfully carried out in 
this kind of underdeveloped area. 


An Approach to Measuring Results in Social Work. David G. French. New York: 
Columbia University Press, 1952. Pp. xiv, 178. $3.00. 


John E. Walsh, U. S. Naval Ordnance Test Station


This book is a report on the 1950 Michigan Reconnaissance Study of Evaluative
Research in Social Work, which was sponsored by the Michigan
Welfare League and financially supported by the James Foster Foundation.
“Evaluative research” refers to research performed to evaluate a practice
or a policy. The book is not itself a piece of evaluative research but rather a
survey of the possibilities in applying evaluative research to social work. 
Although carried out with reference to social work in Michigan, the analysis 
and recommendations are believed to have general applicability. The pres- 
entation is both logical and lucid. 

Three general procedures were used in performing the Reconnaissance 
Study. First, factual information which appeared to have application to the 
planning of a research program in social work was assembled for the field of 
research and for the field of social work. Second, the available literature 
dealing with the general field of applied research, with emphasis on social 
science research and social work, was reviewed. Finally, both individual and 
group conferences were held to supplement the available printed information. 
These conferences were used to obtain suggestions and observations as well 
as to test portions of the analysis and conclusions developed during the study. 
As the study progressed, the emphasis shifted from evaluation to research. 
This was due to the great difficulties involved in performing valid and useful 
research in the social work field. However, the evaluation aspect was retained 
as a meeting place for the practical concerns of the social worker and the more 
theoretical attitude of the researcher.

An outline of the content of the seven chapters of the book furnishes a
good indication of the approach used in the Reconnaissance Study. Chapter 
I describes the general motivation which led to the study. Chapter II con- 
siders the situation in Michigan and emphasizes the large financial invest- 
ment in social work (approximately $125,000,000 in Michigan for 1950). 
Chapter III states some general types of questions which professional and 
lay persons raise concerning social work and studies these questions from
the viewpoint of planning a research program. Chapter IV analyzes what is
involved in doing evaluative research. Four representative evaluation studies 
in social work were reviewed to determine their strong and weak points. 
Chapter V considers the relationship between theoretical and applied research
as a basis for planning a research program in social work. Chapter VI
states eight criteria which are believed to be important for any continuing 
program of research in social work and outlines an administrative setting 
which appears capable of satisfying all the criteria. Chapter VII extends the 
general considerations of the preceding chapter into some moderately specific 
suggestions concerning the establishment of an institute for research in social 
work. 

The four methodology analyses of evaluation studies considered in Chap- 
ter IV are themselves of interest. Only the study title, the authors, and the 
person making the analysis are stated here: (1) “Measuring Results in Social
Casework: A Manual on Judging Movement,” by J. McVicker Hunt and
Leonard S. Kogan with analysis by John C. Hill. (2) “Changing Attitudes
Through Social Contact,” by Leon Festinger and Harold Kelley with analysis
by Leon Festinger. (3) “An Experiment in the Prevention of Delinquency,”
by Edwin Powers and Helen L. Witmer with analysis by Helen L.
Witmer. (4) “Unraveling Juvenile Delinquency,” by Sheldon and Eleanor
Glueck with analysis by Alfred J. Kahn.

The principal recommendation of the book is that an appropriate location 
for research in social work is as an institute which is part of a graduate professional
school of social work in a university. This recommendation is motivated
by the criteria of Chapter VI and seems to be both reasonable and feasible.
The specific suggestions and financial estimates given in Chapter VII
for the establishment and staffing of such an institute for research in social 
work also appear to be reasonable. 

This book represents a well-thought-out first approach to the problem of 
obtaining valid quantitative information in the important practical field of 
social work. The basic difficulties are clearly recognized and no attempt is
made to underemphasize their magnitude. Slow progress is anticipated in
attaining even moderate objectives.

Evidently the development of research practices and theories which have 
practical value for social work would require a large amount of time, effort, 
and money. However, the great scope and importance of social work would 
seem to indicate that the eventual gains from a well-founded evaluative research
program would far offset the investment involved. This has been the
case in many other practical fields and there is no indication that the social 
work field represents an exception. 




RANDOM DIGITS (15,126-17,375) 
From A Million Random Digits, to be published by the Rand Corporation, Santa Monica, California.


33591 59785 12833 98932 68064 
58418 90331 55858 04015 21454 
64446 51017 22280 75597 50227 
72186 00303 38880 93327 49522
98626 82484 54610 507211 78610 
58393 20225 05436 46172 88951 
37346 51007 38032 36002 21080 
70712 44236 96795 92351 92844 
93585 09918 30983 44282 66849 
09473 72923 16747 49802 50639 
40229 34921 60405 06803 19332 
39795 77221 10012 40798 33864 
10288 57483 10881 58984 45136 
75702 69428 34047 76224 45887 
18129 93659 58389 19715 66259 
17777 41004 47057 30688 07539 
75195 62294 03371 11672 13089 
09722 67635 12114 63055 09214 
78800 86912 42076 50287 97998
94287 54751 36242 36557 85604 

5997 30761 97081 09501 68887
8241 30402 12318 52430 40139
54382 73370 26184 14024 57444
77681 74946 02099 69091 19372
53148 26074 52293 65359 63971
27212 80889 46933 13364 33883
03867 03105 87912 29610 75108
79895 82633 19209 21548 35022 
55256 69386 57453 70147 73538 
75937 31113 07607 48037 71020 
83389 80236 65972 74528 40888 
37363 30345 79933 71058 34826 
21960 95585 40374 13239 56162 
49562 44137 46625 20031 08524
22000 97414 30980 74485 26480
55386 95918 92481 49234 62616 
uu 69513 36950 63526 > 93824 
Ds 95851 90956 64494 95979 
200 » 65817 99523». » 73180 59978 

30031 ‚ 18840 99260 21284 




54021 
71820 
17709 
08254 
81351 


30665 
09069 
50593 
49578 
58388 


36995 
66348 
45457 
35406 
09678 


31604 
05523 
09458 
74086 
89299 


78089 
62067 
18924 
78656 
68420 


39182 
26224 
15737 
51093 
67799 


23875 
13701 
87741 
42479 
76691 


86707 
92564 
94648 
29838 
36368 


09254 
41486 
90794 
58558 
52392 


46095
09925 
70550 
50281 
22767 


29008 
11038 
94849 
78963 
70961 


43699 
78653 
14698 
18100 
40915 


36202 
87666 
78252 
59553 
24538 


55850 
34355 
29207 
24077 
30765 


36887 
41248 
44409 
18550 
16483 


51174 
19972 
00496 
62290 
28342 


56766 
95168 
86434 
47405 
33263 


18895 
39987 
05598 
10664 
17792 


07510 
58524 
51934
84833 
53546 


19501 
96974 
42483
83708 
26514 


83672 
20183 
31771 
95437 
10255 


03593 
90094 
04737 
59836 
94507 


07971 
78055 
98239 
57852 
52426 


25644 
75127 
43632 
21369 
00348 


65867 
78667 
45345 
37171 
26482 


98146 
13442 
57764 
38948 
14551 


01932 
13169 
22400 
14055 
62048 


81790 
02283 
32171 
28050 
84792 


51039 
54508 
03295 
17105 
70291 


56181 
34321 


71204
27298


56322 


96395 




29498


81585 
59108 
84576 


59990 
82876 
36417 
91696 
11548 


62062 
82955 
75563 
00009 


83781 


62275 
71419 
38513 
75329 
71581 


47390 
05622 
52794 
56058 
71336 


06606 
76755 
12835 
16618 
29854 


31883 
75609 
99851 
34464 
97697 


74832 
28776 
36746 
13409 
74110 


84500 
58504 
16300 
25003 
67315 


05023 
12862 
40642 
95086 
63731 


PRINCETON UNIVERSITY PRESS

COHORT FERTILITY * Native White Women in the United States
by P. K. Whelpton

The annual fertility experience from 1915 to 1950 is tabulated 
and analyzed for the white women born in each of the years from 
1880 to 1933. This book develops new techniques of measuring 
past fertility trends and provides an improved basis for forecasting 
future trends. Sponsored by the Scripps Foundation. 

516 PAGES, $6 


A THEORY OF ECONOMIC-DEMOGRAPHIC 
DEVELOPMENT * by Harvey Leibenstein, 
foreword by Frank Notestein 


How can the overpopulated, underdeveloped areas of the world 
break out of the Malthusian trap? This is the first book to deal with 
the economic aspects of this vital question. No other book yet pub- 
lished attempts to integrate capital accumulation and population 
growth into a single scheme.

222 PAGES. $4


COLONIAL DEVELOPMENT AND POPULATION 
IN TAIWAN * by George W. Barclay,
foreword by Frank Notestein


Offers an unusual and authoritative view of an agrarian region in 
the process of development by a colonial power. Based on the re- 
markable and unprecedented statistical data which the Japanese 
compiled to aid their administration—one of the most complete 
and creditable records for a population of this size that has ever
been at the disposal of demographers. 


296 PAGES. 78 TABLES, 32 FIGURES. $5 


Order from your bookstore * PRINCETON UNIVERSITY PRESS
Please mention the Journal of the American Statistical Association in writing advertisers


JOURNAL OF THE AMERICAN 
STATISTICAL ASSOCIATION 


Number 267 SEPTEMBER 1954 Volume 49 


A QUARTERLY MODEL FOR THE 
UNITED STATES ECONOMY*


HAROLD BARGER 
Columbia University 
and 
LAWRENCE R. KLEIN
University of Michigan 


MANY theories about economic behavior imply a belief that it can
be represented by some system of equations whose solution is
determinate. The econometric problem is to specify the form of the
functions and to estimate the parameters. It may be that there exists
more than one system adequate for the purpose; it may also be that,
for an entire economy, even the simplest such system is too complex
to be estimated from any known body of data. This paper describes an
attempt to represent quarterly movements of gross national product
in the U. S. economy by a model with as few as three equations. Failure
in the attempt may suggest that a more complex model is required.
Most empirical econometric studies have been based on annual time
series data,1 the sample size seldom exceeding twenty to thirty observations.
It has frequently been suggested that econometric research
would benefit from the use of quarterly data. The sample would be
enlarged by a factor of four and more detailed information obtained
about movements of the economy. Although we do not expect to get
four times as much information by shifting from annual to quarterly
data, we do expect to add something to our knowledge of underlying
economic structure. For instance, lags can be measured more accurately.


* This paper describes some results of an investigation made possible by grants from the Columbia
University Council for Research in the Social Sciences and the Social Science Research Council, to
both of whom we are grateful. We are also indebted to Sylvia Schlachter, formerly of Columbia University
and now of the University of Michigan, who carried out the computations.
1 See, however, Colin Clark, “A System of Equations Explaining the United States Trade Cycle,
1921 to 1941,” Econometrica, 17 (1949), pp. 93-124, where use is made of quarterly data.




414 AMERICAN STATISTICAL ASSOCIATION JOURNAL, SEPTEMBER 1954


On the other hand we face new problems such as seasonal variation and 
increased serial correlation of disturbances. 

In this paper we shall discuss two simple quarterly models and esti- 
mate their parameters from U. S. data for the interwar period. We 
shall then test the models by extrapolating the results into the period 
since World War II. 


CHOICE OF A MODEL 


Our starting point is the three-equation model already fitted by
Klein to annual data.2 This choice was made so that we would have a
good basis for comparison of our results with those of an annual
model. The three-equation model is compact and permits considerable
experimentation. Of the following variables, all of which are measured
in constant prices, the first six are regarded as endogenous and the
remaining as exogenous:


$C$      consumer expenditures
$W_1$    wages and salaries paid by private industry
$\Pi$    non-wage income or “profits”
$I$      net private domestic investment
$K$      year-end stock of capital
$Y$      net national income
$W_2$    wages and salaries paid by government
$T$      indirect taxes less subsidies
$G$      government purchases plus net foreign balance
$t$      time
$u$      random disturbance


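In the annual model, equations for which are as follows, negative subscripts
outside parentheses denote variables lagged by the number of
years indicated.

(1.1)  $C = \alpha_0 + \alpha_1 (W_1 + W_2) + \alpha_2 \Pi + \alpha_3 (\Pi)_{-1} + u_1$

(1.2)  $I = \beta_0 + \beta_1 \Pi + \beta_2 (\Pi)_{-1} + \beta_3 (K)_{-1} + u_2$

(1.3)  $W_1 = \gamma_0 + \gamma_1 (Y + T - W_2) + \gamma_2 (Y + T - W_2)_{-1} + \gamma_3 t + u_3$

(1.4)  $Y + T = C + I + G$

(1.5)  $Y = \Pi + W_1 + W_2$

(1.6)  $I = K - (K)_{-1}$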


2 Lawrence R. Klein, Economic Fluctuations in the United States, 1921-1941 (New York: Wiley,
1950), pp. 55-80.


A QUARTERLY MODEL FOR THE UNITED STATES ECONOMY 415 


Of the three stochastic equations, (1.3) can be regarded either as distributing
income between wages and profits (including interest and
rent), or as the demand for labor. If we prefer the latter viewpoint, a
direct measure of private output is required; therefore we replace
$(Y + T - W_2)$ by $(C + I + G - W_2)$.3 The last three equations, being
merely accounting identities, have known coefficients and are not subject
to random disturbances in behavior. Equation (1.4), however, is
subject to errors of observation insofar as a statistical discrepancy
exists between direct estimates of national product (expenditure
version) and estimates of national income (factor payments version)
converted to market prices.

To adapt the above model so that the variables relate to quarters 
instead of years, a one-year lag needs to be replaced by a lag distributed 
over several quarters. Yet if lagged variables are introduced too freely 
into a linear scheme, intercorrelation between them may render results 
indeterminate. Therefore we grouped lagged variables in pairs (from 
this point forward, negative subscripts outside parentheses will denote
values lagged by the number of quarters indicated): 


$(x)_{-(2n+1)/2} = \dfrac{(x)_{-n} + (x)_{-(n+1)}}{2}$
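
As an illustration of this pairing convention, the sketch below averages the values of a series lagged n and n + 1 quarters, which corresponds to a subscript of -(2n+1)/2. The function name `paired_lag` and the sample figures are hypothetical and are not drawn from the paper's data or computations.

```python
def paired_lag(series, t, n):
    """Mean of series[t - n] and series[t - n - 1], i.e. the lag -(2n + 1)/2."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if t - (n + 1) < 0 or t >= len(series):
        raise IndexError("not enough observations for the requested lag pair")
    return (series[t - n] + series[t - (n + 1)]) / 2


# Hypothetical quarterly series, oldest observation first.
profits = [10.0, 12.0, 11.0, 13.0, 14.0, 16.0]
lag_three_halves = paired_lag(profits, t=5, n=1)  # averages lags 1 and 2
lag_seven_halves = paired_lag(profits, t=5, n=3)  # averages lags 3 and 4
print(lag_three_halves, lag_seven_halves)
```

Averaging adjacent lags in this way halves the number of lagged regressors, which is the point of the grouping: fewer, less intercorrelated terms enter the linear scheme.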


With the help of this convention we might (for instance) write the fol- 
lowing quarterly version of (1.1) to (1.6), the time unit referred to 
being three months instead of a year.


(231) C= a+ (Wi + We) + aM + оа) + ш, 

(2.2) I = By -B:1TH- Bs (IT). 574- Bs (IT) 1/2 + B8 (LI) -72- B8 (C) ado, 
(23) Wi = тоу (C4-12-G— Юа) Jv (C 2-2 +G- W3) sab vtt o, 
together with (1.4), (1.5), and (1.6) above.


THE ESTIMATION OF PARAMETERS 


The equations (2.1), (2.2), and (2.3) each contain at least two
endogenous variables. We could, of course, estimate the coefficients
in each equation by conventional use of least squares. To do so we
should have arbitrarily to treat some particular variable as dependent,
and all others (endogenous or otherwise) as independent. This procedure
is arbitrary and, moreover, is known to lead to biased estimates of the


3 Were it not for the statistical discrepancy in our national accounts, we would have the same
measure of output reckoned as a sum of expenditures or as a sum of factor income payments, and the
replacement mentioned would not affect computations.




parameters.4 To be sure, in at least one case that has been investigated,
i.e. the annual model given above, the bias proved to be smaller for
most parameters than the estimated sampling errors.5 We do not know
that this will always be the case. In any event the compulsion to choose
as dependent one only among several variables, all of which clearly
are endogenous, is unwelcome. Therefore we decided to use unbiased
or consistent methods of estimation.
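
The bias mentioned above can be seen in a small simulation. The sketch below is not the authors' model: it assumes a hypothetical two-equation system, C = a + bY + u with the identity Y = C + Z and Z exogenous, and compares the least-squares regression of C on Y with an estimate of b recovered from the regression of C on Z alone. All names and parameter values are illustrative.

```python
import random


def ols_slope(x, y):
    """Slope of the least-squares regression of y on x (with intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def simulate(n=20000, a=10.0, b=0.6, seed=0):
    rng = random.Random(seed)
    z = [rng.gauss(50.0, 5.0) for _ in range(n)]  # exogenous expenditure
    u = [rng.gauss(0.0, 5.0) for _ in range(n)]   # disturbance
    # Solving C = a + b*Y + u together with Y = C + Z gives the reduced form.
    c = [(a + b * zi + ui) / (1.0 - b) for zi, ui in zip(z, u)]
    y = [ci + zi for ci, zi in zip(c, z)]
    direct = ols_slope(y, c)              # C regressed on endogenous Y
    reduced = ols_slope(z, c)             # C regressed on exogenous Z: b / (1 - b)
    indirect = reduced / (1.0 + reduced)  # solve back for b
    return direct, indirect


direct_b, indirect_b = simulate()
print(f"true b = 0.6, direct = {direct_b:.3f}, indirect = {indirect_b:.3f}")
```

With equal variances for Z and u, as assumed here, the direct estimate converges to 0.8 rather than 0.6, because Y is correlated with the disturbance; the estimate built from the exogenous variable alone stays close to the true value.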

To obtain consistent estimates of the parameters in (2.1) to (2.3),
the equations must be regarded as simultaneous and the system treated
as a whole. With six endogenous and nine exogenous or predetermined
variables, estimation by the limited-information maximum-likelihood
method is fairly laborious; yet this system is the closest analogue to
the annual model, and computation of its parameters was undertaken.6
The results contradicted the assumption that the disturbances ($u_1$,
$u_2$, and $u_3$) were random. Quarterly variables are more highly autocorrelated than
annual variables, and it is not surprising that the same should be true
of the residuals from a regression between such variables.


An obvious device is to assume that the disturbances satisfy a lag
correlation scheme such as

$u_i = \sum_{j=1}^{3} \rho_{ij} (u_j)_{-1} + v_i$   ($v_i$ mutually independent).


However, to estimate $\rho_{ij}$ simultaneously with the other parameters by
the method of maximum likelihood would be burdensome.

In the limited information method, each individual equation of the 
system uses only restrictions on the parameters of that equation and 
not those on other equations. Thus it would be possible to obtain 


limited information estimates if each disturbance satisfied a pure autoregressive
equation

$u_i = \rho_i (u_i)_{-1} + v_i$   ($v_i$ mutually independent).


Yet the computational burden imposed even with this simplification 
was more than we wanted to undertake at this stage of research. 
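
For a single equation, one standard way to handle a first-order autoregressive disturbance is the iterative Cochrane-Orcutt procedure. The sketch below is offered only as an illustration of the idea and is not taken from the paper, whose own treatment is given in its Appendix; the simulated data, function names, and parameter values are all hypothetical.

```python
import random


def ols(x, y):
    """Intercept and slope of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def cochrane_orcutt(x, y, iterations=10):
    """Estimate y = a + b*x + u with u = rho*u(-1) + v by iterated least squares."""
    a, b = ols(x, y)
    rho = 0.0
    for _ in range(iterations):
        resid = [yi - a - b * xi for xi, yi in zip(x, y)]
        # Regress each residual on its own lagged value (no intercept).
        num = sum(resid[t] * resid[t - 1] for t in range(1, len(resid)))
        den = sum(resid[t - 1] ** 2 for t in range(1, len(resid)))
        rho = num / den
        # Quasi-difference the data so the transformed disturbance is v.
        xs = [x[t] - rho * x[t - 1] for t in range(1, len(x))]
        ys = [y[t] - rho * y[t - 1] for t in range(1, len(y))]
        a_star, b = ols(xs, ys)
        a = a_star / (1.0 - rho)
    return a, b, rho


rng = random.Random(1)
n_obs = 2000
x_data = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
u_prev, y_data = 0.0, []
for xi in x_data:
    u_prev = 0.7 * u_prev + rng.gauss(0.0, 1.0)  # AR(1) disturbance, rho = 0.7
    y_data.append(1.0 + 2.0 * xi + u_prev)

a_hat, b_hat, rho_hat = cochrane_orcutt(x_data, y_data)
print(f"a = {a_hat:.2f}, b = {b_hat:.2f}, rho = {rho_hat:.2f}")
```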


A RECURSIVE MODEL 


Our procedure was somewhat different. We converted the system 
of equations to recursive form, thus enabling us to obtain consistent


4 See, e.g., T. Koopmans, “Statistical Estimation of Simultaneous Economic Relations,” Journal
of the American Statistical Association, 40 (1945), pp. 448-66.
5 Klein, op. cit. Compare pp. 68, 75.
6 For the procedure see, e.g., T. W. Anderson and H. Rubin, “Estimation of the Parameters
of a Single Equation in a Complete System of Stochastic Equations,” Annals of Mathematical
Statistics, XX (1949), pp. 46-63.




estimates by repeated application of single-equation least-squares
methods.7 It proved computationally feasible to do this while assuming
that the disturbances satisfied first-order autoregressive equations. For
this arrangement the first equation to be estimated must have only one
endogenous variable, the remaining variables being predetermined or
exogenous. Each subsequent equation must contribute only one additional
endogenous variable. In estimating an equation that contains
more than one endogenous variable, in the case of all but one such
variable the calculated rather than the observed values are used in
the computations. The procedure leads to consistent estimates. In some
recursive systems, the assumption that unlagged disturbances in
separate equations are independent makes maximum likelihood estimates
obtainable by repeated application of the method of least
squares. If we drop these independence assumptions and substitute
calculated instead of observed values of endogenous variables in successive
equations, we obtain consistent but not necessarily full maximum-likelihood
estimates.
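
The stepwise procedure described above can be illustrated with a hypothetical two-equation recursive system in which the disturbances of the two equations are correlated. In the sketch below (names and parameter values are assumptions made for the example, not the authors' model), the first equation is fitted by least squares and its calculated values replace the observed ones in the second equation.

```python
import random


def ols(x, y):
    """Intercept and slope of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


rng = random.Random(2)
n_obs = 20000
z = [rng.gauss(0.0, 2.0) for _ in range(n_obs)]     # predetermined variable
u1 = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
u2 = [0.8 * e + rng.gauss(0.0, 0.6) for e in u1]    # correlated with u1

x1 = [1.0 + 1.5 * zi + e for zi, e in zip(z, u1)]   # first equation
x2 = [2.0 + 0.5 * xi + e for xi, e in zip(x1, u2)]  # second equation

# Step 1: the first equation has one endogenous variable, so fit it directly.
a1, b1 = ols(z, x1)
x1_calc = [a1 + b1 * zi for zi in z]

# Step 2: use calculated, not observed, values of x1 in the second equation.
_, slope_stepwise = ols(x1_calc, x2)
_, slope_observed = ols(x1, x2)
print(f"true 0.5, stepwise {slope_stepwise:.3f}, observed {slope_observed:.3f}")
```

Because the calculated values depend only on the predetermined variable, they are uncorrelated with the second disturbance, so the stepwise slope stays near 0.5 while the regression on observed values is pulled upward by the correlation between the two disturbances.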

From a formal standpoint it makes no difference which equation we
estimate first, provided it can be written with one endogenous variable
as a function of predetermined variables alone. We believe that investment
decisions depend upon a longer range of past experience, and
result more slowly in actual expenditures, than other types of decision.
Therefore we put $\beta_1 = 0$ in equation (2.2), estimate the remaining
parameters in this equation by least squares, and use the calculated
value of investment as an exogenous variable in the consumption
equation. This treatment allows consumption to respond immediately
to changes in income and permits consistent estimation by least
squares. On the other hand, the distinction between wage and non-wage
income, as in equation (2.1), has to be abandoned. On substituting
calculated (denoted by superior ~ for observed values of I, 


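\[ Y = C + \tilde{I} + G - T; \]
and we may write:
\[
\begin{aligned}
C &= \alpha_0' + \alpha_1' Y + u_1' \\
  &= \frac{\alpha_0'}{1-\alpha_1'} + \frac{\alpha_1'}{1-\alpha_1'}(\tilde{I} + G - T) + \frac{u_1'}{1-\alpha_1'} \\
  &= \alpha_0'' + \alpha_1''(\tilde{I} + G - T) + u_1''.
\end{aligned}
\]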
The disadvantage resulting from the consolidation of wage and non- 


"Herman Wold and L. Juréen, Demand Analysis (New York: Wiley, 1953), p. 14. 


418 AMERICAN STATISTICAL ASSOCIATION JOURNAL, SEPTEMBER 1954


instead of Y, national income.⁸ We shall further introduce lagged
consumption on the right hand side, to take account of the influence of
past behavior.⁹ Finally we substitute computed values on the right
hand side of the wage equation (2.3) and estimate this also by the
method of least squares.

The above procedures yield the following fully recursive model which
may be estimated consistently by least squares methods.


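Model A (Fully recursive)

\[ C = \alpha_0 + \alpha_1 (C)_{-3/2} + \alpha_3(\tilde{I} + G - T'') + u_1, \qquad u_1 = \rho_1 (u_1)_{-1} + v_1 \tag{3.1} \]
\[ I = \beta_0 + \beta_1 (\Pi)_{-3/2} + \beta_2 (\Pi)_{-7/2} + \beta_3 (\Pi)_{-11/2} + \beta_4 (K)_{-1} + u_2, \qquad u_2 = \rho_2 (u_2)_{-1} + v_2 \tag{3.2} \]
\[ W_1 = \gamma_0 + \gamma_1 (C + I + G - W_2) + \gamma_2 (C + I + G - W_2)_{-1} + \gamma_3 t + u_3, \qquad u_3 = \rho_3 (u_3)_{-1} + v_3 \tag{3.3} \]
\[ Y + T = Y'' + T'' = C + I + G \tag{3.4} \]
\[ Y'' + T'' - T = \Pi + W_1 + W_2 \tag{3.5} \]
\[ I = K - (K)_{-1} \tag{3.6} \]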


Equation (3.2) must be estimated first, and the calculated values for
$\tilde{I}$ substituted in (3.1). When (3.1) has been estimated, values of
$\tilde{C}$ and $\tilde{I}$ are substituted in (3.3). The complication
introduced by the autoregressive treatment of the disturbances is
discussed in the Appendix. More generally, it may be seen that the
condition for recursive treatment (i.e. consistent estimation in stepwise
fashion) is that the Jacobian of the transformation connecting the
disturbances with the endogenous variables shall be triangular and (by
the rule of normalization) equal to a constant, unity. Thus, in the
present example
\[
\frac{\partial(v_2, v_1, v_3)}{\partial(I, C, W_1)} =
\begin{vmatrix}
1 & -\alpha_3 & -\gamma_1 \\
0 & 1 & -\gamma_1 \\
0 & 0 & 1
\end{vmatrix} = 1.
\]

The reader will observe that the first column of the Jacobian refers to


* Y' is obtained from Y by 
ments; and deducting personal


not allow the wage and nonwage income 
⁹ This type of lag relation has been
T. M. Brown, “Habit Persistence and Lags in Consumer Behavior,” Econometrica, 20 (1952), pp. 355-1




equation (3.2), estimated first; the second column to equation (3.1), 
estimated next; and the third column to equation (3.3), estimated last. 


A HYBRID MODEL 


We saw above that the simultaneous estimation of all three equations
requires the assumption that disturbances are random, and that in
practice with quarterly data this assumption is contradicted. To avoid
this difficulty we developed a recursive model, as just explained. A
further alternative is to use a model which is partly recursive and
partly simultaneous. Thus we may estimate I as above and substitute
the calculated values in the remaining two equations, and then proceed
to estimate the latter simultaneously by the method of limited
information (instead of successively by the method of least squares).
If we take our consumption equation
\[ C = \alpha_0' + \alpha_1' (C)_{-3/2} + \alpha_2' (W_1 + W_2) + \alpha_3' \Pi + u_1', \]
and for $\Pi$ write
\[ \Pi = C + \tilde{I} + G - W_1 - W_2 - T, \]
we obtain (4.1) below.
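Model B (Hybrid)

\[ C = \alpha_0 + \alpha_1 (C)_{-3/2} + \alpha_2 (W_1 + W_2) + \alpha_3 (\tilde{I} + G - T) + u_1 \tag{4.1} \]
\[ I = \beta_0 + \beta_1 (\Pi)_{-3/2} + \beta_2 (\Pi)_{-7/2} + \beta_3 (\Pi)_{-11/2} + \beta_4 (K)_{-1} + u_2, \qquad u_2 = \rho_2 (u_2)_{-1} + v_2 \tag{4.2} \]
\[ W_1 = \gamma_0 + \gamma_1 (C + I + G - W_2) + \gamma_2 (C + I + G - W_2)_{-1} + \gamma_3 t + u_3 \tag{4.3} \]
\[ Y + T = C + I + G \tag{4.4} \]
\[ Y = \Pi + W_1 + W_2 \tag{4.5} \]
\[ I = K - (K)_{-1} \tag{4.6} \]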
Here consumption depends separately upon wage and nonwage income,
as in the original annual model, and this feature makes fully recursive
treatment impossible. Equation (4.2) is estimated by the method of
least squares as before, but even after calculated values for I have
been substituted, equations (4.1) and (4.3) both contain more than one
endogenous variable. However we estimated (4.1) and (4.3)
consistently by doing so simultaneously, using computational procedures
to which reference has already been made.¹⁰

¹⁰ Anderson and Rubin, op. cit. Equations (4.1) and (4.3) do not contain autoregressive error terms
because of the heavy additional burden of computation that their introduction would impose: see
above.




SEASONAL VARIATION IN QUARTERLY DATA 


In the later sections of this paper the parameters in Models A and B 
are estimated, and their predictive power tested, using seasonally ad- 
justed data throughout. The practical necessity of using corrected 
data can readily be demonstrated. In the first place, some components 
of national income (e.g. farm operators' income) already have been 
partially corrected, and other components (e.g. business profits) have 
been fully corrected for seasonal variation during original collection or 
subsequent processing of the data; such series are not available in 
“seasonally unadjusted” form. The implications of using data, some of 
which have and some of which have not been corrected for seasonal 
movement, are obscure. 

In the second place, if seasonally unadjusted data are to be used, the
seasonal behavior has to be incorporated into the model itself. A simple
and plausible representation of seasonal behavior requires
multiplication by a parameter that varies only with the season. Thus if
\[ Y = \alpha_1 x_1 + \alpha_2 x_2 + \cdots + u \tag{5.1} \]
is a structural equation in the absence of seasonal variation, to allow for
such variation in the model we need to write
\[ Y = (\beta_1 z_1' + \beta_2 z_2' + \beta_3 z_3' + \beta_4 z_4')(\alpha_1 x_1 + \alpha_2 x_2 + \cdots) + u \tag{5.2} \]
where $z_i'$ assumes the value unity in the ith quarter and is zero in all
other quarters. If the $\beta$ as well as the $\alpha$ have to be estimated from the
data, it will be found that the estimating equations do not readily admit
of numerical solution since the behavior equation (5.2) is nonlinear in
the parameters.¹¹ An advantage of using unadjusted data in this way
would be knowledge of the number of degrees of freedom absorbed in
estimating the $\beta$. The adjustment of data for seasonal variation also
uses up degrees of freedom; but we never know just how many degrees
are absorbed in the process.
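The product form of a seasonal equation such as (5.2), in which a quarterly factor multiplies a linear combination of the explanatory variables, can be sketched as follows. This is an illustration under our own assumptions: the function name, the zero-based quarter index, and the numerical values are hypothetical. It shows one symptom of the nonlinearity, namely that the seasonal factors and the structural coefficients enter only through their product, so that rescaling one set and inversely rescaling the other leaves the systematic part unchanged.

```python
import numpy as np

def seasonal_systematic_part(beta, alpha, x, quarter):
    """Seasonal factor for each observation times the linear combination x @ alpha.

    beta    : four seasonal factors, one per quarter
    alpha   : structural coefficients
    x       : (n, k) array of explanatory variables
    quarter : length-n integer array with values 0..3 (assumed coding)
    """
    beta = np.asarray(beta, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    x = np.asarray(x, dtype=float)
    quarter = np.asarray(quarter, dtype=int)
    return beta[quarter] * (x @ alpha)

# Hypothetical values, two years of quarterly observations.
beta = np.array([1.00, 1.10, 0.90, 1.05])
alpha = np.array([2.0, 3.0])
x = np.arange(16, dtype=float).reshape(8, 2)
quarter = np.tile(np.arange(4), 2)

y_a = seasonal_systematic_part(beta, alpha, x, quarter)
y_b = seasonal_systematic_part(2.0 * beta, alpha / 2.0, x, quarter)
```

Here `y_a` and `y_b` coincide, so some normalization of the seasonal factors would be needed before the two sets of parameters could be estimated separately.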

However, theoretical superiority does not seem to lie wholly on the
side of unadjusted data. It may be urged that the economic subject
makes his own (rough) seasonal corrections as he goes along. The con- 
sumer does not react to his income and to the time of year as separate 
data, but asks himself, “Is this more or less than the income I would 
expect at this time of year?” The entrepreneur looks at his profits and

¹¹ See also L. Hurwicz, “Variable Parameters in Stochastic Processes: Trend and Seasonality,”
in Statistical Inference in Dynamic Economic Models, ed. T. C. Koopmans (New York: Wiley, 1950).




compares them with some level expected for the season. Each applies a
rough seasonal correction he carries in his head. 

If this version of the facts is accepted, seasonally adjusted rather 
than unadjusted data appear to be the more accurate measure of the 
variables that interest us. However, the seasonal correction effected by 
the economic subject must be based wholly on past experience, al- 
though it doubtless is continually revised as history unfolds. Of course, 
seasonal corrections made by statisticians are based on data for the 
current year, earlier years, and later years. The theoretical justification 
for using standard methods to correct the raw data for seasonal
variation is therefore incomplete. The most we can say is that seasonal
adjustment by conventional methods enables us to approximate more
closely the variables most relevant to economic behavior than no
seasonal adjustment at all. 


SAMPLE DATA USED FOR ESTIMATION 


Parameters in both models A and B were estimated from 72 quarterly
observations for the years 1923-1940. All observations were expressed
in constant (1939) prices. Variables were defined in accordance with
Department of Commerce usage¹² except that we substituted a fresh
series for capital consumption allowances. Annual Commerce figures
for 1929-1938 were interpolated with data from Barger’s Outlay and
Income in the United States¹³ and were extrapolated back to 1921 in the
same way. All data are seasonally adjusted quarterly totals and are
expressed in $ million in 1939 prices,¹⁴ excepting only t, which numbers
the quarters consecutively.¹⁵


MODEL A (FULLY RECURSIVE) 


Investment equation. Equation (3.2) has the following coefficients
when estimated by least squares.

¹² See Survey of Current Business, “National Income Supplements.” $\Pi$ comprises business profits
(corporate and noncorporate, including income of farmers and the independent professions), and interest
and rents, before tax.

¹³ New York: National Bureau of Economic Research, 1942.

¹⁴ The various components of gross national product were deflated with readily available price
indexes. Capital consumption allowances were deducted from gross national product (both in 1939
prices), yielding net national product. A comparison of the latter, quarter by quarter, with net national
product in current prices yielded a single implicit price index which was used to deflate all components
of income.

¹⁵ To print the sample data would require excessive space. Magnitudes may be indicated by quoting
the following mean values for the sample period: C, 14,588; $W_1$, 9,784; $\Pi$, 4,126; D 80670, 152001
pp 1,343; T, 1,674; G, 2,396; $Y''$, 16,173. Values of K and t are zero for the last quarter of 1922; and
1080 and 72 respectively for the last quarter of 1940.




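\[
\begin{aligned}
I &= -2016 + \underset{(0.403)}{0.438}\,(\Pi)_{-3/2} + \underset{(0.360)}{0.615}\,(\Pi)_{-7/2} - \underset{(0.429)}{0.205}\,(\Pi)_{-11/2} - \underset{(0.038)}{0.054}\,(K)_{-1} + u_2 \\
u_2 &= \underset{(0.211)}{0.603}\,(u_2)_{-1} + v_2
\end{aligned}
\tag{6.1}
\]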


In parentheses are shown estimates of the sampling errors, whose exact
distributions are not known. While sampling errors for individual
coefficients of $(\Pi)_{-3/2}$, $(\Pi)_{-7/2}$, and $(\Pi)_{-11/2}$ are quite large, the
corresponding error for their sum is smaller than any of the separate errors,
being 0.294. The statistic $\delta^2/s^2$ (= 2.016) may be computed as the ratio
of the mean square successive difference of the residuals $v_2$ to their
variance. The distribution of the statistic is such that for a random
series with (say) 60 degrees of freedom the probability is 0.95 that
$\delta^2/s^2$ will exceed 1.6.¹⁶ Hence the result is compatible with the
assumption that the disturbances are random.
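The ratio of the mean square successive difference to the variance can be computed along the following lines. This is a minimal sketch: the divisors (n - 1 for the successive differences, n for the variance) are an assumption on our part, and the function name is hypothetical. For a random series the ratio lies near 2; values well below 2 point to positive autocorrelation.

```python
import numpy as np

def successive_difference_ratio(residuals):
    """Mean square successive difference divided by the variance."""
    r = np.asarray(residuals, dtype=float)
    n = r.size
    delta2 = np.sum(np.diff(r) ** 2) / (n - 1)   # mean square successive difference
    s2 = np.sum((r - r.mean()) ** 2) / n         # variance about the mean
    return delta2 / s2

# A steadily trending series gives a low ratio; an alternating one a high ratio.
trend_ratio = successive_difference_ratio([0.0, 1.0, 2.0, 3.0])
alternating_ratio = successive_difference_ratio([1.0, -1.0, 1.0, -1.0])
```

With these two toy series the function returns 0.8 and 4.0 respectively, bracketing the value of about 2 expected under randomness.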

Consumption equation. As the next step we substitute $\tilde{I}$ from (6.1) on
the right hand side of (3.1). Estimation of the latter equation yielded
results compatible with the assumption of random $v_1$, but reported a
negative value for $\alpha_3$. This result implies that consumption depends
inversely on current income, a conclusion we rejected. Deciding that $\alpha_3$
must be positive, we put $\rho_1$ identically equal to zero, and contented
ourselves with the estimate


\[ C = 266 + 0.990\,(C)_{-3/2} + 0.036\,(\tilde{I} + G - T'') + u_1 \tag{6.2} \]
which is equivalent to
\[ C = 257 + 0.955\,(C)_{-3/2} + 0.035\,Y'' + u_1' \tag{6.3} \]
for which $\delta^2/s^2 = 1.35$, indicating significant autocorrelation of the
residuals. An autoregression coefficient computed from the observed
residuals of (6.2) was $\rho_1 = 0.329$.

Because of the transformation of variables, the confidence intervals
for the coefficients in (6.3) are not symmetrical about the point
estimates. If $\alpha_1'$ is the coefficient of $(C)_{-3/2}$, and $\alpha_3'$ of $Y''$ in (6.3), the
limits at the 95% level are approximately
\[ 0.931 < \alpha_1' < 0.984; \quad \text{and} \quad -0.072 < \alpha_3' < 0.122. \]

¹⁶ B. I. Hart and J. von Neumann, “Tabulation of the Probabilities for the Ratio of the Mean
Square Successive Difference to the Variance,” Annals of Mathematical Statistics, XIII (1942).
We of course use only the lower tail of the distribution.


In Model A this consumption function was the least implausible we
obtained in a number of trials, with and without lagged values of
consumption, with and without autoregressive transformation of
disturbances. Plainly it suffers from three serious defects. (1) The residuals
cannot be considered random. (2) The low value of $\alpha_3$ (or $\alpha_3'$) furnishes
a very weak link with the investment equation. (3) The confidence
limits for $\alpha_3$ (or $\alpha_3'$) include negative values.

Long-run marginal propensity to consume. Despite the fact that $(C)_{-3/2}$ is
the dominating variable in (6.3), $Y''$ does play a larger role as the time
over which the equation functions is lengthened. Let
\[ (C)_1 = \alpha_0 + \alpha_1 (C)_0 + \alpha_3 (Y'')_1 + (u)_1. \]
Then
\[ (C)_n = \alpha_0 \sum_{i=0}^{n-1} \alpha_1^i + \alpha_1^n (C)_0 + \alpha_3 \sum_{i=0}^{n-1} \alpha_1^i (Y'')_{n-i} + \sum_{i=0}^{n-1} \alpha_1^i (u)_{n-i}. \]
In the limit, if we assume a steady income stream, we have
\[ C = \frac{\alpha_0}{1-\alpha_1} + \frac{\alpha_3}{1-\alpha_1}\,Y'' + v, \]
where v is a linear combination of random variables. Evidently we may
regard $\alpha_3/(1-\alpha_1)$ as a long-run marginal propensity to consume, its
estimated value being about 0.78. We therefore report a sizeable
influence of income upon consumption in the long run, even though (6.3)
has the appearance of almost pure autoregression.
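The limiting argument can be checked numerically. The sketch below is illustrative only: it takes the coefficients of (6.3) as printed (257, 0.955, 0.035), treats the lag as a single period as in the derivation above, omits the disturbance, and uses a hypothetical steady income of 16,000; the function names are ours.

```python
def long_run_mpc(a1, a3):
    """Long-run marginal propensity to consume, a3 / (1 - a1)."""
    if not abs(a1) < 1:
        raise ValueError("the limit exists only when |a1| < 1")
    return a3 / (1.0 - a1)

def steady_state_consumption(a0, a1, a3, income):
    """Limit of consumption under a constant income stream."""
    return (a0 + a3 * income) / (1.0 - a1)

def iterate_consumption(a0, a1, a3, income, c0, periods):
    """Apply C = a0 + a1*C(previous) + a3*income repeatedly, without disturbance."""
    c = c0
    for _ in range(periods):
        c = a0 + a1 * c + a3 * income
    return c

mpc = long_run_mpc(0.955, 0.035)
limit = steady_state_consumption(257.0, 0.955, 0.035, 16000.0)
path_end = iterate_consumption(257.0, 0.955, 0.035, 16000.0, c0=14000.0, periods=2000)
```

The value of `mpc` is about 0.78, and the iterated path settles at the steady-state level, although with a coefficient of 0.955 on lagged consumption the approach is slow.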
We can also construct a joint confidence region for $\alpha_1$ and $\alpha_3$ in the
form of an ellipse, to see whether $\alpha_3/(1-\alpha_1)$ is estimated in the range
(0, 1) even though $\alpha_3$ is not itself significantly positive.¹⁷ Our estimate
of the propensity to consume proves very rough, for about one-tenth
of the area of an ellipse drawn at the 95% level admits negative values.

¹⁷ An example of the preparation of this type of confidence region is given in T. Haavelmo,
“Methods of Measuring the Marginal Propensity to Consume,” Journal of the American Statistical
Association (1947), pp. 105-22.

The wage equation. The calculation of (3.3) yielded an estimate of $\rho_3$
practically equal to unity and substantial autocorrelation of the
residuals $v_3$. Moreover $(\gamma_1 + \gamma_2)$ was estimated well below 0.4, although
annual models have shown the marginal influence of $(C + I + G - W_2)$
to lie between 0.5 and 0.6. Neither estimation of the equation in terms
of first differences (i.e. putting $\rho_3$ identically equal to unity), the
addition of longer lags for the independent variable, the insertion of higher
powers of t, nor the use of a second-order autoregressive scheme for the
disturbances, led to improved results. Nor were lagged values of the
dependent variable helpful on the right hand side of the equation.
Probably the distribution of income cannot be satisfactorily approximated
by any simple linear relation. In the present study we contented
ourselves with the following equation in which $\rho_3$ is identically zero:


\[ W_1 = 665 + \underset{(0.210)}{0.759}\,(C + \tilde{I} + G - W_2) - \underset{(0.212)}{0.178}\,(C + I + G - W_2)_{-1} - \underset{(4.3)}{0.953}\,t + u_3 \tag{6.5} \]


The estimated coefficients are reasonable enough, but $\delta^2/s^2 = 1.21$ (i.e.
clearly below 1.6). The residuals therefore are autocorrelated, and in fact
yield an autoregression coefficient of 0.401. Despite a common belief
that the distribution of income was shifting during the period that we
chose for our sample, the coefficient of t proves not to be significantly
different from zero.


MODEL B (HYBRID) 


Investment equation. It will be recalled that equations (4.2) and (3.2)
are identical, investment being estimated in the same fashion in both
models. Calculated values, $\tilde{I}$, are therefore obtained from (6.1) above.

Consumption and wage equations. Parameters in equations (4.1) and
(4.3) need to be estimated by the simultaneous treatment of these two
equations.¹⁸ The results, shown as (6.6), (6.7), and (6.8), are consistent
but not efficient. Because endogenous variables appear on the right
hand side, estimation of (4.1) and (4.3) independently by least squares
leads to biased results. Estimates of the coefficients obtained separately
for each equation by least squares are shown in (6.7) and (6.8) in
parentheses above the unbiased estimates, and suggest that the bias is
not quantitatively important in the present case. Sampling errors of the
unbiased estimates are shown in parentheses below the latter.


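\[ C = 1061 + \underset{(0.174)}{0.721}\,(W_1 + W_2) + \underset{(0.146)}{0.383}\,(C)_{-3/2} - \underset{(0.065)}{0.192}\,(\tilde{I} + G - T) + u_1 \tag{6.6} \]
which is equivalent to
\[ C = \overset{(1375)}{1313} + \overset{(+0.696)}{0.655}\,(W_1 + W_2) + \overset{(0.443)}{0.474}\,(C)_{-3/2} \; \overset{(-0.257)}{-\,0.238}\,\Pi + u_1' \tag{6.7} \]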


¹⁸ For computational procedure, see Anderson and Rubin, op. cit.




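\[ W_1 = \overset{(570)}{604} + \overset{(+0.546)}{\underset{(0.098)}{0.706}}\,(C + \tilde{I} + G - W_2) \; \overset{(-0.041)}{\underset{(0.096)}{-\,0.117}}\,(C + I + G - W_2)_{-1} \; \overset{(-1.67)}{\underset{(1.97)}{-\,2.815}}\,t + u_3 \tag{6.8} \]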


As before we calculated the residuals and estimated $\rho$ from them.
For (6.6)
\[ \delta^2/s^2 = 1.67, \qquad \rho_1 = 0.177, \]
and for (6.8)
\[ \delta^2/s^2 = 1.45, \qquad \rho_3 = 0.288. \]
For the consumption equation the result is just compatible with the
assumption of random disturbances, but the wage equation, as in
Model A, yields residuals that show significant autocorrelation. As in
Model A, the investment equation is not solidly linked to the others:
although lagged consumption plays a smaller role than in (6.2) and
(6.3), the coefficients of $\tilde{I}$ in (6.6) and of $\Pi$ in (6.7) have the wrong sign.


TESTING THE MODELS 


How well do our models represent behavior, first during and second
outside the sample period? The acid test is the model’s ability to
predict. Of course the forecasts of models such as ours are conditional, in
the sense that the values of the exogenous (but not of the endogenous)
variables must be known for the period to which the forecast applies.
To be sure the practical forecaster may regard this as a fatal
disadvantage; here, however, we are concerned with prediction, not for its
own sake, but only as a means of testing the models. The sample period
comes to an end in 1940. Accordingly we shall test the model by making
predictions for 1941 and for 1947 through 1952, omitting the war years
as not relevant for the purpose in hand.

The procedure is as follows. We can estimate the values of the
endogenous variables for any quarter (say, 1st of 1947) on the basis of
observed values of all quantities for preceding quarters (through 4th of
1946) and data for exogenous variables only ($W_2$, T or $T''$, and G) for
the current quarter (1st of 1947), and compare the estimates so
obtained with the observed values of the endogenous variables for the
current quarter (1st of 1947). We can repeat this for the 2nd quarter of
1947, using complete data for the first quarter and exogenous variables
for the second quarter, and so on. Thus we build up a succession of
short-range predictions, each one quarter ahead. The question may
then be posed, whether the performance of the model is appreciably
better than guesswork.

Solution of the three equations for each model, together with the three
accounting identities, yields estimates of six quantities: gross national
product (GNP), national income, and the two endogenous
components of each (I, C; $W_1$, $\Pi$). Of course the six predictions are


by chance, on the assump 
are equally likely, 
sive quarters are i 


rection. For this more stringent test, the chances of correct prediction 
1 quarters appear to be completely independent. In s 
language, it perhaps is not too hard to guess that an upward or
downward sweep will continue; it is less easy correctly to predict reversals of
direction.

19 To predict GNP,
2 Results of
about to be given fori
Bh r investment and GNP, as
by chance, a result рейарв
referred to below.

Predicting investment. The investment equation (6.1), common to
both models A and B, yields the results in Table 1. In predicting 28
of the 40 reversals of direction during the sample period, the investment
equation obviously does much better than could have been achieved by
accident. The result during 1947-1952 is less impressive, for only 7 of 9
reversals were correctly predicted, the chance of so good a result
occurring by accident being about 1 in 10.
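The chance probabilities quoted here can be reproduced under one assumption of ours: that each prediction of direction is treated as an independent trial with probability one-half of being right by accident, so that the probability of so good a result is the upper tail of a binomial distribution. The function name below is hypothetical.

```python
from math import comb

def chance_probability(hits, trials):
    """Probability of at least `hits` correct calls in `trials` fair, independent guesses."""
    return sum(comb(trials, k) for k in range(hits, trials + 1)) / 2 ** trials

p_postwar_reversals = chance_probability(7, 9)    # 7 of 9 reversals
p_sample_reversals = chance_probability(28, 40)   # 28 of 40 reversals
p_postwar_all = chance_probability(17, 24)        # 17 of 24 quarters
```

Under this assumption the three values come to roughly 0.09, 0.008, and 0.032, the first of which corresponds to the "about 1 in 10" stated in the text.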


TABLE 1

INVESTMENT: SUCCESSIVE QUARTERLY PREDICTIONS
(s in $ million at 1939 prices)

                                                  Quarters in Which Direction
                                All Quarters      of Change Was Reversed
Period                    s       p       P          p        P

1923 to 1940
(sample period)          477    51/72   .001       28/40    .008
1941                     343     3/4    .31         0/1    1.00
1947 to 1952             567    17/24   .032        7/9     .09


Since the investment equation (3.2) can be written in the form
\[
\begin{aligned}
I = {} & \rho_2 (I)_{-1} + \beta_0 (1 - \rho_2) + \beta_1 [(\Pi)_{-3/2} - \rho_2 (\Pi)_{-5/2}] \\
& + \beta_2 [(\Pi)_{-7/2} - \rho_2 (\Pi)_{-9/2}] + \beta_3 [(\Pi)_{-11/2} - \rho_2 (\Pi)_{-13/2}] \\
& + \beta_4 [(K)_{-1} - \rho_2 (K)_{-2}] + v_2
\end{aligned}
\]
we can estimate the variance of forecast values of I as a sum of
products, each product being obtained by multiplying an estimated variance
or covariance of $\beta_1$, $\beta_2$, $\beta_3$, $\beta_4$, and $\rho_2$ (or of their products) by a square or
product of values of predetermined variables in the forecast period,
together with an estimate of the residual variance.²¹ Taking as values of
the predetermined variables their mean level during the 24 forecast
quarters 1947-1952, we can approximate a standard error of forecast
for the postwar period of $907 million (1939 prices). The root mean
square of the observed errors of prediction for 1947-1952 (s in Table 1)
was somewhat smaller than this.

²¹ See, e.g., H. Hotelling, “Problems of Prediction,” American Journal of Sociology, 48
(1942-43), pp. 61-76.

Predicting gross national product. The performance of the complete 
models A and B is conveniently tested by predicting GNP.²² We did
not estimate the standard error of forecast for GNP because of the
heavy computation required. But we carried out the other tests
described above, both for models A and B and for three “guesswork
models” of progressively increasing sophistication. Call GNP y and
indicate lags, as before, by suffixes. Assume first that GNP will be the
same this quarter as last quarter:

\[ y = (y)_{-1}. \tag{Guesswork Model I} \]


Embodying the simplest possible assumption, this model cannot of 
course be used for forecasting direction of change, but it affords a cri- 
terion for measuring errors of prediction. 

Assume, second, that GNP’s change from last quarter will be one- 
half GNP's change last quarter from the preceding quarter: 


\[ y = (y)_{-1} + \tfrac{1}{2}\,[(y)_{-1} - (y)_{-2}]. \tag{Guesswork Model II} \]

The fraction one-half is chosen arbitrarily on the assumption that for a
stable series such as GNP the autoregression coefficient between first
differences lies between 0 and 1.²³

Thirdly, let us fit a second-order difference equation to the observed
values of GNP during the sample period (1923-1940):

\[ y = 4858 + 0.873\,(y)_{-1} - 0.125\,(y)_{-2}. \tag{Guesswork Model III} \]

Model III is perhaps no longer pure guesswork, but it still will serve as
a standard of performance against which to test our two econometric
models A and B. Results of the tests are shown in Tables 2, 3, and 4.
During the sample period, both of the econometric models fit the ob- 
served data markedly better than any of those based on guesswork 
(Table 2), However, in predicting reversals of direction, only Model A 
performs significantly better than might be expected from chance. 
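The three guesswork rules and the error measure s can be written out as follows. This is a sketch under stated assumptions: the function names are ours, `history` is a list of past quarterly values with the most recent last, and the coefficients of the third rule are taken as read from the printed equation, whose signs are partly illegible in this copy.

```python
from math import sqrt

def guesswork_1(history):
    """No change: this quarter equals last quarter."""
    return history[-1]

def guesswork_2(history):
    """Last quarter plus one-half of the preceding quarterly change."""
    return history[-1] + 0.5 * (history[-1] - history[-2])

def guesswork_3(history):
    """Second-order difference equation with coefficients as read from the text."""
    return 4858.0 + 0.873 * history[-1] - 0.125 * history[-2]

def root_mean_square_error(predicted, observed):
    """Root mean square of the errors of prediction (the quantity s in the tables)."""
    errors = [p - o for p, o in zip(predicted, observed)]
    return sqrt(sum(e * e for e in errors) / len(errors))

# Hypothetical GNP values in $ million, for illustration only.
history = [20000.0, 20400.0]
forecasts = [guesswork_1(history), guesswork_2(history), guesswork_3(history)]
```

The first rule never predicts a change of direction and the second never predicts a reversal, which is why only their errors, and not their record on turning points, offer a meaningful standard of comparison.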


²² GNP equals (C + I + G) together with depreciation, all in 1939 prices. C and I are predicted by
the model; G and depreciation are exogenous. When predicting with models A and B, we use the
respective values of $\rho_1$ and $\rho_3$ obtained from residuals observed during the sample period and quoted
above.




TABLE 2

GROSS NATIONAL PRODUCT: SUCCESSIVE QUARTERLY
PREDICTIONS, 1923-1940 (SAMPLE PERIOD)
(s in $ million at 1939 prices)

                                                  32 Quarters in Which
                                                  Direction of Change
                              All 72 Quarters     Was Reversed
Model                   s       p        P           p        P

Econometric Model
  A                    652    54/72    .0001       24/32    .003
  B                    708    51/72    .0002       20/32    .108
Guesswork Model
  I                    760    36/72    .5          16/32    .57
  II                   805    40/72    .2           0/32   1.00
  III                  939    36/72    .5          15/32   [illegible]


During 1941—the year immediately following the close of the sample 
period—GNP rose steadily, so that no reversals of direction occurred 
(Table 3). Here, because the period contains no turning point, Guess- 
work Model II performs as well as the econometric models. 


TABLE 3

GROSS NATIONAL PRODUCT: SUCCESSIVE QUARTERLY
PREDICTIONS, 1941
(s in $ million at 1939 prices)

Model                       s              p              P

Econometric Model
  A                    [illegible]        4/4            .06
  B                       1,857           4/4         [illegible]
Guesswork Model
  I                       1,118           2/4         [illegible]
  II                        594           4/4         [illegible]
  III                     3,115       [illegible]     [illegible]


We also carried out the tests for the postwar period, with the results 
shown in Table 4. Again the econometric models perform much better
than any of the attempts at guesswork. Although during 1947-1952 




TABLE 4

GROSS NATIONAL PRODUCT: SUCCESSIVE QUARTERLY
PREDICTIONS, 1947-1952
(s in $ million at 1939 prices)

                                                  11 Quarters in Which
                                                  Direction of Change
                              All 24 Quarters     Was Reversed
Model                   s       p        P           p        P

Econometric Model
  A                    651    19/24    .003         8/11    .118
  B                  1,087    18/24    .011         9/11    .033
Guesswork Model
  I                    678    12/24    .58         5½/11    .61
  II                   687    13/24    .42          0/11   1.0
  III                4,499     6/24    .99          5/11    .78


they perform less satisfactorily than during the sample period, both 
Models A and B show significant predictive power if all 24 predictions 
are considered independent. On the other hand if the test is confined to 
the prediction of reversals of direction in GNP, the result is less clear 
cut. For Model B, chances are 30 to 1 against accidentally calling 9 of 
the 11 turning points; but for Model A the chances are only 10 to 1 
against calling 8 of the turns. Yet the consistently superior performance 


of Model A, when measured by root mean square error of prediction, is 
noteworthy. 


A LONG-RANGE EXTRAPOLATION 


Instead of a series of short-range predictions, each one quarter into
the future, we may project a single extrapolation as far ahead as we
please, provided only that the exogenous variables are known or can
be estimated. Suppose a computer to sharpen his pencil after close of
business on December 31, 1946. He is supplied with the model, with
data for all variables up to and including the quarter now ended, and
with (correctly) anticipated values of the exogenous variables for each
quarter through the end of 1952. His forecasts of investment and
GNP (both in 1939 prices) are shown in Charts I and II. Predicted
investment rises sharply to a peak in 1948 two quarters earlier than the
downturn in actual investment. Predicted investment then declines


CHART I. Net Private Domestic Investment, as actually observed and as
predicted by Models A and B. The predictions are assumed to be made as of
Dec. 31, 1946. (Vertical axis: billions of 1939 dollars per quarter;
horizontal axis: 1945 to 1952, with the date of forecast marked.)


steadily for three years, missing entirely the pre- and post-Korea boom,
turning up again only in the summer of 1951. In the case of GNP, the
upward trend from 1947 to 1952 is correctly predicted, but the wavelike
movement forecast for 1949-50 did not eventuate.

Summary totals for each variable, actual and predicted, for the
six-year period are shown in Table 5.

For GNP, root mean square differences (s) between prediction and
observation for the 24 quarters are: Models A and B, $2.1 and 2.7
billion respectively; Models I, II, and III, $3.9, 5.6, and 15.8 billion
respectively.

Although the econometric models predict six-year totals for GNP 
and national income better than any of the guesswork models, the 
same is not uniformly true of the components. Indeed Guesswork 
Model I, whose performance elsewhere was so indifferent, happens 
here to score a bulls-eye in forecasting the six-year total for investment! 


CHART II. Gross National Product, as actually observed and as predicted by
Models A and B. The predictions are assumed to be made as of Dec. 31, 1946.
(Vertical axis: billions of 1939 dollars per quarter; horizontal axis: 1945
to 1952, with the date of forecast marked.)


TABLE 5

LONG-RANGE PREDICTIONS, 1947-1952
(Six-year totals in $ billion at 1939 prices)
Assumed date of forecast: December 31, 1946

                          I      C     GNP    W1      Π      Y

As actually observed     [values illegible]
Econometric model
  A                      60    566    865    421    210    704
  B                      63    584    885    434    218    [illegible]
Guesswork model
  I                      10    546    790    435    143    650
  II                     27     52    740    419    138    622
  III                    not computed  493   not computed




Even so unsophisticated a guess must sometimes turn out correct.

Some of the errors of the econometric models can readily be
rationalized. Indeed, the diagnosis of errors is very necessary if models
are to be improved. In the present instance, the overstatement of
investment by the econometric models may be explained by our use of
nonwage income before taxes, $\Pi$, as independent variable; taxes
(especially here corporate taxes) are notoriously higher today than during
the sample period.²⁴ Moreover $\Pi$ itself is overstated, and this is related
to the understatement of $W_1$; a shift appears to have occurred in the
distribution of income since the close of the sample period. To adapt
the models to take account of these particular changes would not be
difficult, but such revisions lie outside the scope of this paper.


COMPARISON OF ANNUAL WITH QUARTERLY MODELS 


Annual data cover up movements occurring within the year. A model 
fitted to annual data contains less information than a quarterly model. 
The former will not yield quarterly information without many arbitrary 
assumptions, but a quarterly model can always be converted to an 
annual basis by summation. 
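Conversion of a quarterly series to an annual basis by summation amounts to adding each consecutive group of four quarters. A minimal sketch follows; the function name and the sample figures are hypothetical, and the series is assumed to begin with a first quarter and to contain whole years only.

```python
import numpy as np

def annual_totals(quarterly):
    """Sum consecutive groups of four quarterly values into annual totals."""
    q = np.asarray(quarterly, dtype=float)
    if q.size % 4 != 0:
        raise ValueError("series must contain whole years (a multiple of 4 quarters)")
    return q.reshape(-1, 4).sum(axis=1)

# Two hypothetical years of quarterly data.
totals = annual_totals([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
```

The reverse operation has no such simple form, since many quarterly paths are compatible with a given annual total.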

To approximate an annual investment equation, we combine (3.2)
and (3.6), putting $\rho_2$ = zero and using the values for other parameters
given in (6.1). We obtain
\[
\begin{aligned}
(I)_3 + (I)_2 + (I)_1 + I = {} & A + 0.438\,(\Pi)_{3/2} + 0.414\,(\Pi)_{1/2} + 1.007\,(\Pi)_{-1/2} \\
& + 0.953\,(\Pi)_{-3/2} + 0.346\,(\Pi)_{-5/2} + 0.327\,(\Pi)_{-7/2} - 0.184\,(\Pi)_{-9/2} \\
& - 0.174\,(\Pi)_{-11/2} - 0.198\,(K)_{-1} + U
\end{aligned}
\]
where A is a constant and U is a linear combination of disturbances
which for the present purpose we may consider random. Time zero
is February 15 of the year for whose four quarters investment is
estimated. Taking $(\Pi)_{3/2}$ and $(\Pi)_{1/2}$, together with one-half $(\Pi)_{-1/2}$, as
relating to the current year, dividing the remaining lagged values of $\Pi$
between the two preceding years, summing the coefficients, and dividing
by 4, we obtain the following annual equation (time zero becoming
June 30):


²⁴ At the time our model was estimated, no adequate breakdown of income taxes by kind of income
was available. Annual estimates for disposable income have only recently been compiled in a fashion to
segregate wage from other income: see Lenore Frane and L. R. Klein, “The Estimation of Disposable
Income by Distributive Shares,” Review of Economics and Statistics, Vol. XXXV (1953), pp. 333-37.

²⁵ To derive this result we substitute $[K - (K)_{-1}]$ for I in (6.1) and compute $[(K)_3 - (K)_{-1}]$ by
recursion starting out with the expression for K, obtaining $(K)_1$, then $(K)_2$, and then $(K)_3$.

\[ I = A + 0.34\,\Pi + 0.51\,(\Pi)_{-1} - 0.07\,(\Pi)_{-2} - 0.20\,(K)_{-1} + U. \tag{7.1} \]
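The summation of the quarterly investment equation over four quarters can be checked by recursion on the capital stock. The sketch below is our own reconstruction, not the authors' computation: it assumes the quarterly coefficients 0.438, 0.615, and -0.205 on nonwage income lagged 3/2, 7/2, and 11/2 quarters, the coefficient -0.054 on capital lagged one quarter, and the identity that investment equals the change in capital. Function and variable names are hypothetical.

```python
from collections import defaultdict

def summed_investment_coefficients(income_coefs, capital_coef):
    """Coefficients of the four-quarter sum of investment.

    income_coefs : {lag in quarters: coefficient} for nonwage income
    capital_coef : coefficient of capital lagged one quarter
    Returns ({time relative to time zero: coefficient}, coefficient of capital at -1).
    """
    keep = 1.0 + capital_coef              # K(t) = keep * K(t-1) + other terms
    summed = defaultdict(float)
    for quarter in range(4):               # quarters 0, 1, 2, 3 of the year
        weight = keep ** (3 - quarter)     # weight of this quarter's terms in K(3)
        for lag, coef in income_coefs.items():
            summed[quarter - lag] += weight * coef
    capital_total = keep ** 4 - 1.0        # K(3) - K(-1), coefficient on K(-1)
    return dict(summed), capital_total

income_sum, capital_sum = summed_investment_coefficients(
    {1.5: 0.438, 3.5: 0.615, 5.5: -0.205}, -0.054)

# Current-year coefficient: the two leading terms plus half of the next, over 4.
current_year = (income_sum[1.5] + income_sum[0.5] + 0.5 * income_sum[-0.5]) / 4
```

The recursion reproduces the summed coefficients to within about 0.001, a margin consistent with rounding of the quarterly estimates, and gives a current-year coefficient of about 0.34.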


This compares with the following equation obtained directly from
annual data:

\[ I = A + 0.23\,\Pi + 0.55\,(\Pi)_{-1} - 0.15\,(K)_{-1} + U. \tag{7.2} \]


The long run interpretation of the quarterly consumption equation 
(6.3) has already been discussed. To convert this conveniently to an 
annual basis, write (C)_s2=3[(C)1+(C)_.]. After some substitution 
and rearrangement the following is obtained (lags refer to quarters): 


(C)s + (0): + (С), + C 
= A + 0.607(C)_ + 0.877[(C)_2 + (C). + (C).4] 
+ 0.270(C)_s + 0.035(Y)s + 0.051(Y); + 0.076(Y): 
+ 0.096Y + 0.061(Y)_, + 0.044(У)_. + 0.020(Y). 
+ U. 

where as before A is a constant and U is a linear combination of vari-
ables that may be considered random. Summing coefficients applicable
to respective years and dividing by 4 we obtain as an annual consump-
tion equation:


(7.3)  $C = A + 0.81(C)_{-1} + 0.07(C)_{-2} + 0.06Y + 0.03(Y)_{-1} + U.$
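The arithmetic behind (7.3) can be checked directly. The sketch below is only an illustration, not part of the original computation: the function names are hypothetical, and the lag indices attached to each coefficient are read from the four-quarter equation above on the assumption that leads 0 to 3 belong to the current year, lags 1 to 4 to the preceding year, and lag 5 to the year before that.

```python
# Quarterly coefficients from the four-quarter consumption equation.
# Keys are quarterly offsets: positive = lead, negative = lag (assumed reading).
c_coeffs = {-1: 0.607, -2: 0.877, -3: 0.877, -4: 0.877, -5: 0.270}
y_coeffs = {3: 0.035, 2: 0.051, 1: 0.076, 0: 0.096,
            -1: 0.061, -2: 0.044, -3: 0.020}


def year_of(quarter_offset):
    """Map a quarterly offset to a year offset: 0..3 -> 0, -1..-4 -> -1, -5..-8 -> -2."""
    return quarter_offset // 4


def annualize(coeffs):
    """Sum the quarterly coefficients falling in each year and divide by 4."""
    totals = {}
    for quarter, value in coeffs.items():
        year = year_of(quarter)
        totals[year] = totals.get(year, 0.0) + value
    return {year: total / 4 for year, total in totals.items()}


annual_c = annualize(c_coeffs)
annual_y = annualize(y_coeffs)

for year in sorted(annual_c, reverse=True):
    print(f"C lagged {-year} year(s): {annual_c[year]:.2f}")
for year in sorted(annual_y, reverse=True):
    print(f"Y lagged {-year} year(s): {annual_y[year]:.2f}")
```

Under these assumptions the rounded results agree with the four coefficients reported in (7.3).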


The autoregressive component, although not quite so overwhelming 
as in the quarterly equation from which it is derived, still is far stronger 
than that obtainable directly from annual data:²⁷


(7.4) C =A + 046(C).; + 0.81Y + 0.034 + U. 


The consumption equation in Model B may also be adapted to an
annual basis. A procedure strictly analogous to that used with equation
(6.3) when applied instead to (6.7) yields the following annual con-
sumption equation:

(7.5)  $C = A + 0.12(C)_{-1} + 0.01(C)_{-2} + 0.89(W_1 + W_2) + 0.20(W_1 + W_2)_{-1} - 0.32\Pi - 0.07(\Pi)_{-1} + U.$


The closest comparable equation obtained directly from annual data re-
ports positive coefficients for $\Pi$ and $(\Pi)_{-1}$:²⁸


(7.6)  $C = A + 0.80(W_1 + W_2) + 0.02\Pi + 0.23(\Pi)_{-1} + U.$
* Klein, op. cit., p. 68.
27 Carl Christ, “A Test of an Econometric Model for the U.S.,” Conference on Business Cycles, p. 7:
28 Klein, op. cit., p. 68.



A QUARTERLY MODEL FOR THE UNITED STATES ECONOMY 435 


The two wage equations yield the following annual versions. From
(6.5):

(7.7)  $W_1 = A + 0.65(C + I + G - W_2) - 0.07(C + I + G - W_2)_{-1} + 16t + U.$

From (6.8):

(7.8)  $W_1 = A + 0.63(C + I + G - W_2) - 0.04(C + I + G - W_2)_{-1} + 45t + U.$

Computed from annual data:²⁹

(7.9)  $W_1 = A + 0.42(C + I + G - W_2) + 0.16(C + I + G - W_2)_{-1} + 130t + U.$


It will be seen that the sum of the coefficients of $(C + I + G - W_2)$, this
year and last year, is practically the same in all three equations. How-
ever, individual coefficients differ somewhat, and the time trend re-
ported from annual data is much stronger than in either of our quar- 
terly models. 


CONCLUSION 


To the question whether it is possible satisfactorily to represent quar-
terly movements of gross national product in the U.S. economy by as
simple an equation system as that discussed here, these results offer no
conclusive answer.

An inspection of the two models described reveals three main weak-
nesses. (1) In each model at least one of the three equations showed
significant autocorrelation of residuals. (2) In the recursive model (A)
the coefficient of income in the consumption equation is small, and its
significance could not be established, so that the linkage between the
first two equations in this model is poor. The hybrid model (B) shows
a different, though probably related, weakness: the coefficient of non-
wage income in the consumption equation is negative. (3) In both
models the sampling errors of some coefficients are uncomfortably
large.

Constructed from data for 1923-1940, the models were tested by
their performance during 1947-1952. (1) In a series of short-range
predictions one quarter ahead both models performed uniformly better
than guesswork, but their superiority was not decisive in a statistical 
sense. Probabilities against results being obtained by chance ranged 


29 Ibid.




from about 10 to 1 to much higher and clearly significant odds, de- 
pending upon the extent to which successive predictions were assumed 
to be independent. (2) In a single long-range prediction by each model 
for the entire six-year period, the econometric models forecast GNP 
with a smaller root mean square error than any of the guesswork 
schemes. In terms of components of GNP it is not possible to say that 
the models did appreciably better than guesswork. 

Evidently room for improvement is large. The obvious advantage of
more plentiful observations, both in constructing and testing of models,
when quarterly data are used will not be fully realized until further
progress has been made with the autocorrelation problem. It may well
be that some of the defects of the models discussed can be overcome
only through the use of a more detailed and complex system of equa-
tions.


APPENDIX 


The equations for obtaining least-squares estimates of the param-
eters $\alpha_i$ of an equation with autocorrelated disturbances

(8.1)  $x = \alpha + \alpha_1 z_1 + \cdots + \alpha_n z_n + u$

       $u = \rho(u)_{-1} + v$   ($v$ random)

are obtained by choosing that set of estimates of $\alpha_i$ and $\rho$ which mini-
mize $\sum v^2$. The sample observations cover the period $t = 1, 2, \ldots, T$.

We can rewrite (8.1):

(8.2)  $x - \rho(x)_{-1} = \alpha(1 - \rho) + \alpha_1[z_1 - \rho(z_1)_{-1}] + \cdots + \alpha_n[z_n - \rho(z_n)_{-1}] + v.$

(It will be noticed that if $\rho = 1$, the case is equivalent to the use of first
differences.) Assume that all variables are measured from their means,
and write

$X = x - \rho(x)_{-1}$
$Z_i = z_i - \rho(z_i)_{-1}$

Minimization of $\sum v^2$ with respect to $\alpha_i$ then leads to the familiar
equations

(8.3)  $\sum(X Z_i) = \alpha_1 \sum(Z_1 Z_i) + \cdots + \alpha_n \sum(Z_n Z_i)$   $(i = 1, \ldots, n)$


We proceed to minimize $\sum v^2$ with respect to $\rho$ and obtain

(8.4)  $\sum [X - \alpha_1 Z_1 - \cdots - \alpha_n Z_n]\,[(x)_{-1} - \alpha_1(z_1)_{-1} - \cdots - \alpha_n(z_n)_{-1}] = 0.$


The sums of (8.3) are all quadratic expressions in $\rho$, so that we can
solve for $\alpha_i$ in terms of $\rho$. The solution is in each case the ratio of two
polynomials, each of degree 2n in $\rho$. The coefficients are throughout
combinations of moments of sample observations. Inspection of (8.4)
shows that the highest order terms will be of the form $\alpha_i \alpha_j \rho$. Therefore
substitution of the polynomials obtained from (8.3) in (8.4) will lead
to a polynomial of degree (4n+1) in $\rho$.
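The step from the minimization to (8.4), and the degree count that follows, can be written out as below. This is a reader's sketch under the appendix's own assumptions (variables measured from their means, so the constant drops out); the symbols $P_i$ and $D$ for the numerator and common denominator polynomials are introduced here for illustration and do not appear in the original.

```latex
% Residual as a function of rho and the alpha_i
v = X - \sum_{i=1}^{n} \alpha_i Z_i ,
\qquad X = x - \rho\,(x)_{-1}, \quad Z_i = z_i - \rho\,(z_i)_{-1}

% Derivative of the residual with respect to rho
\frac{\partial v}{\partial \rho}
  = -\Bigl[(x)_{-1} - \sum_{i=1}^{n} \alpha_i (z_i)_{-1}\Bigr]

% First-order condition, which is (8.4) up to the factor -2
\frac{\partial}{\partial \rho} \sum v^2
  = -2 \sum \Bigl[X - \sum_i \alpha_i Z_i\Bigr]
            \Bigl[(x)_{-1} - \sum_i \alpha_i (z_i)_{-1}\Bigr] = 0

% Degree count: solving (8.3) gives alpha_i = P_i(rho) / D(rho),
% with P_i and D of degree at most 2n. The leading terms of (8.4)
% are alpha_i alpha_j rho; clearing the denominator D^2 leaves
\deg_\rho \bigl[P_i(\rho)\,P_j(\rho)\,\rho\bigr] = 2n + 2n + 1 = 4n + 1 ,
\qquad n = 4 \;\Rightarrow\; 4n + 1 = 17 .
```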

Although of high degree (17th degree in the case of our investment
equation, which has four independent variables) these equations in $\rho$
are not hard to solve numerically by iteration, because the relevant
(and often the only real) root must lie in the interval $1 \ge \rho \ge -1$, at
least for series where the end effect may be neglected. For if u satisfies

$u = \rho(u)_{-1} + v$

then

$E\,u(u)_{-1} = \rho\,E(u)_{-1}^2 + E\,v(u)_{-1}.$

Now $(u)_{-1}$ can be expressed as a constant plus a linear combination
of past values of v up to $(v)_{-1}$. Hence if the v are mutually independent,
$E\,v(u)_{-1}$ must vanish and

(8.5)  $\rho = \dfrac{E\,u(u)_{-1}}{E(u)^2}$

which is the autocorrelation coefficient for u in a long series.

Once (8.4) is solved for $\rho$, $\alpha_i$ may be obtained by substituting $\rho$ in
(8.3).

If the residuals in the structural equation are assumed to satisfy a 
second or higher order difference equation, estimation is a much more 
difficult matter, since simultaneous nonlinear equations must be solved. 
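For readers who wish to experiment with the first-order case, the following is a minimal numerical sketch. It does not solve the polynomial of degree (4n+1) described above; instead it substitutes a simpler technique, scanning $\rho$ over a grid inside the interval from -1 to 1 and, for each value, fitting the $\alpha_i$ by ordinary least squares on the transformed variables of (8.2), keeping the pair that gives the smallest $\sum v^2$. The function name, the grid spacing, the explicit intercept (used here in place of measuring variables from their means), and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def fit_ar1_regression(x, Z, grid=None):
    """Grid scan over rho; OLS for the alphas at each rho; keep the minimum sum of v**2."""
    x = np.asarray(x, dtype=float)
    Z = np.asarray(Z, dtype=float)
    if grid is None:
        # Stay strictly inside (-1, 1) so that alpha = constant / (1 - rho) is defined.
        grid = np.linspace(-0.99, 0.99, 199)
    best = None
    for rho in grid:
        # Transformed variables X = x - rho*x(-1), Z_i = z_i - rho*z_i(-1), as in (8.2).
        X_t = x[1:] - rho * x[:-1]
        Z_t = Z[1:] - rho * Z[:-1]
        design = np.column_stack([np.ones(len(X_t)), Z_t])
        coef, *_ = np.linalg.lstsq(design, X_t, rcond=None)
        resid = X_t - design @ coef
        ssr = float(resid @ resid)
        if best is None or ssr < best[0]:
            best = (ssr, float(rho), coef)
    ssr, rho, coef = best
    intercept = coef[0] / (1.0 - rho)  # the fitted constant is alpha * (1 - rho)
    return rho, intercept, coef[1:], ssr


# Simulated example (hypothetical data): two regressors, true rho = 0.6.
rng = np.random.default_rng(0)
T = 400
Z = rng.normal(size=(T, 2))
v = rng.normal(scale=0.5, size=T)
u = np.zeros(T)
for t in range(1, T):
    u[t] = 0.6 * u[t - 1] + v[t]
x = 1.0 + Z @ np.array([2.0, -1.0]) + u

rho_hat, alpha0_hat, alpha_hat, ssr = fit_ar1_regression(x, Z)
print("rho estimate:", rho_hat)
print("slope estimates:", alpha_hat)
```

The grid scan trades the exactness of a polynomial root for simplicity; the minimizing grid value could then serve as a starting point for the iterative solution of (8.4).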


PROBLEMS OF COORDINATING THE UNITED STATES 
STATISTICAL SYSTEM 


STUART A. RICE
U. S. Bureau of the Budget 


THE statistical system of the United States embraces a great many
official, semi-official, and unofficial agencies and instruments. To-
gether these “comprise a system in the same sense that the activities
of four and one-half million business units comprise a national economic
system.”¹ Elsewhere I have defined the terms here discussed as follows:
“A statistical system exists when coherence is established and main- 
tained among the separate programs that compose it. Such coherence 
requires an item-to-item adjustment of each task and process to every 
other related task and process, whether the relationship be one of con- 
ceptual congruity or one of consistency in operational patterns and 
sequences. The process of attaining and maintaining this coherence is 
called ‘co-ordination’.”²

The end sought is a better integration of our nation’s statistical in- 
telligence. Why is this desirable? I suggest two closely related reasons: 
First, an integrated system is more efficient. Second, it gives us a better 
understanding of the world in which we live. 

Our world is precariously balanced between forces which are further-
ing advances in civilization and others which are pushing us toward
universal catastrophe. The balance among them can easily be tipped
in one direction or the other. We cannot afford to blunder because of
an inadequate understanding of the forces with which we deal. Nor can
we afford to spend a single taxpayer's dollar to lesser advantage than
we might, when the demands upon our Government and upon the
whole economy are so numerous and so pressing. If spent for statistics,
the dollar should produce the greatest possible yield of useful informa-
tion.

The needs of users of statistics are seldom limited to a single series. 
For example, they may need to know simultaneously the facts about 
employment and production—not separately but in relationship. The 
data used, whatever their separate sources, should intermesh. Employ-
ment and production series will not intermesh unless the definitions of
reporting units correspond and unless they are grouped in accordance 


¹ Stuart A. Rice, “The Role [...] of the Federal Statistical System,” The American
Political Science Review, [...], p. 481.
² “Co-ordination of Federal Statistical Programs,” The American Journal of Sociology, [...],
p. 22.




with the same industrial classification. Not so long ago, as the life of a 
bureaucrat goes, both “employment” and “industries” were separately 
and often inconsistently defined and classified by different Federal 
agencies. Hence, basic steps toward the integration of our system of 
statistical intelligence were taken with the development, promulgation, 
and incorporation into general use of the standard industrial classifica- 
tion. Other steps were the standardization of the types by status of 
persons included in the economically active population and the estab- 
lishment of uniformity in reporting periods for employment. 

The importance of such standards is illustrated in reverse when, in
ignorance of them, investigators of economic and social problems
formulate their own categories. Too late they may discover that their
results are noncomparable with basic data, like those of the Bureau 
of the Census, with which to have significance they should be aligned. 

The relative merits of two general methods of integrating a statistical 
system have long been debated. One is administrative centralization, 
achieved with such conspicuous effectiveness in Canada. The other is 
decentralization, accompanied by coordination, exemplified by the 
statistical system of the United States. After many discussions of these 
assumed alternatives it is my judgment that the issues between them 
are largely unreal. In any event I disclaim partisanship. The statistical 
system of any country is an outgrowth of historical development. 
Within it are strains toward adaptation to the political, social, and 
economic structure which it serves and of which it is a part.

The United States has developed an over-all pattern of statistical 
decentralization from which it is now too late to depart. Federal sta- 
tistical activities began to proliferate in a decentralized fashion im- 
mediately after the birth of the republic and the trend thus established 
has continued. The pattern is remarkably adaptive to its milieu. The
uniquely large volume of information produced by the statistical sys- 
tem of the United States reflects the factual-mindedness of our people. 
Statistics have been in demand and the demand could not have been 
satisfied so easily or so well except under the conditions provided by 
the historical decentralization of the nation’s statistical mechanisms. 

Other considerations support the efficacy for us of our own decen-
tralized system. It permits a useful division of labor among agencies,
public and private. It keeps the collection of many data in close contact
with the uses to which they are to be put when assembled, thus giving
protection and assurance to administrators in the fields of utilization.

Nevertheless, without a central mechanism for the coordination of
statistical activities the scene presented in a decentralized statistical




system would be chaotic. In this country the central agency of statisti- 
cal coordination, known as the Office of Statistical Standards, is a part 
of the Bureau of the Budget in the Executive Office of the President.
Its legal powers, affecting other Federal agencies, are more authorita- 
tive than it usually cares to assert; and its interests and concern, 
though not its legal authority, extend to numerous statistical activities 
which are under private direction. 

The limitation of the formal authorities of the Office of Statistical 
Standards to Federal agencies does not impede the integration of the 
nation’s statistical system. True, much statistical work is carried on by 
trade organizations and business units which are highly competitive.
However, the dominance of the Federal Government in statistics and 
the prestige and influence of such Federal agencies as the Bureau of 
the Census and the Bureau of Labor Statistics are so great that busi- 
ness-produced statistics are coordinated informally to a very sub- 
stantial extent. 

Within the statistical system the problems of coordination are nu- 
merous and I shall have to offer a selection. 

1. The first is the problem of representing the general interest. It is 
naive to assume that governments are monolithic. The definite article 
before the noun—“the Government of the United States”—reflects an 
ideal that does not fully exist in actuality. The ensemble of Federal 
statistical agencies is not dissimilar to a trade association. The in- 
dividual Federal agencies, like the individual members of a trade 
organization for an industry, are competitive but recognize many 
common interests. To carry the analogy farther, the Office of Statistical
Standards might be regarded as the secretariat of the trade association 
which serves the Federal statistical products industry. 

Customary procedures for financing and administering the separate
members of this Federal statistics trade association tend to emphasize
the autonomy of its members. The preparation of estimates of ex-
penditures, their defense before the Bureau of the Budget and com-
mittees of the Congress, and the administration of approved programs
are all responsibilities of the separate agency units to which funds are
specifically appropriated. These procedures make statistical coordina-
tion more difficult, for an integrated statistical system should serve public
and governmental interests which override or lie between the interests and
responsibilities of particular agencies.

Even when a Federal agency is persuaded to include in its estimates
items of expenditure for purposes beyond its direct responsibilities,
these are inevitably the first to suffer when reductions are made,




whether in the budgeting, the appropriating, or the administering
stages. The OSS recognizes an obligation to serve as a “public de-
fender” for general interests in statistical data; but recognition of this
obligation in the Congress and even within the Administration itself
is very limited. Neither specific legal authorities nor the sanctions
conveyed by custom are present to lend weight to its appeals. When it
has “gone to bat” in public for comprehensive objectives, as with the
“reconversion statistics program” of 1945, the consequences have not
been encouraging.

The reasons are not far to seek: First, it is contrary to tradition for
representatives of the Budget Bureau to appear before committees of
Congress on behalf of appropriations. Secondly, the Bureau's functions
are properly inconspicuous and anonymous. They do not build up the
type of public support which, in the case of other agencies, is sometimes
mobilized behind programs.

Such considerations as these led the Mills-Long task force on sta-
tistics to recommend to Mr. Herbert Hoover's first Commission on
Organization of the Executive Branch of the Federal Government that 
the OSS be put in possession of free funds for disbursement at its dis- 
cretion on behalf of Federal statistical interests. This is not the place 
to outline the objections to this solution. 

A start has been made in another direction toward accomplishing 
the same objective. Each year the Office of Statistical Standards pre- 
pares a so-called “statistical budget,” recommending programs to be 
undertaken by the principal general-purpose statistical agencies: 
Bureau of the Census, Bureau of Labor Statistics, Bureau of Agricul- 
tural Economics, Office of Business Economics, National Office of Vital 
Statistics, and such joint enterprises as that on financial statistics of the 
Federal Trade Commission and the Securities and Exchange Com-
mission. The purpose is to secure over-all balance and thus achieve
a Federal rather than a series of unrelated departmental programs. 
This “statistical budget” is reviewed in the Bureau of the Budget like 
any other Federal proposal. 

The “general interest” has also been represented and furthered by 
the Office of Statistical Standards in various other ways. I have already 
mentioned some of the standard classifications and definitions that it
has developed cooperatively with the “operating” statistical agencies 
for the use of all of them. For the Joint Committee on the Economic
Report of the Senate and House of Representatives and for the Council
of Economic Advisers it has prepared analyses of the gaps in our na-
tional arsenal of statistical information. It was our privilege some




years ago to initiate the development of the monthly publication
Economic Indicators, now prepared for the Joint Committee by the
Council; and we have just completed the technical work upon a supple-
ment which interprets the sources from which the indicators are
derived. Our “Blue Book” on Statistical Services of the United States
Government has had wide national and international circulation. Our
Federal Statistical Directory and our monthly Statistical Reporter have
been useful instruments of statistical integration within the Govern-
ment. Not least important in this list are the facilities we offer for con-
sultation upon the general interest by the various Federal statistical 
agencies, coming together at our invitation upon neutral ground. 

2. Certain consequential problems are involved in the procedure of 
developing a statistical budget. The separate items it brings together 
must also find a place in the estimates of expenditure submitted and 
supported by the respective departments and agencies which will ad- 
minister the funds. Hence the statistical items must be adjusted to 
other items within the departmental budgets concerned. Occasions have 
arisen in which the Bureau of the Budget has differed with the head 
of a department or agency concerning priorities among the statistical 
and other items in his budget. There have been cases in which funds 
approved by the Bureau for statistical purposes have been diverted 
by administrators to other purposes which they deemed more im- 
portant. Their actions could not easily be challenged without violating 
the sound principle that the responsibilities of an administrator should
be accompanied by command of the resources given him. 
3. Almost inseparable from the problem of defending the general
interest is that of securing balance among particular agency and other 
interests of which a coordinated statistical program must take account. 
Each Federal agency wishes to collect data for which it feels an ad-
ministrative need or for which there is a legal or public demand im-
posed upon it. However, through processes of coordination, a single in-
quiry may often be made to serve additional purposes. The data pro- 
duced can sometimes supply the essential needs of other agencies as 
well. The original agency is not necessarily made happy by this pros- 
pect. It is exposed to the danger that its own purposes may be inade-
quately or belatedly advanced.

Dangers to the “other agencies” are also present. The relationships 
established when one agency renders service to another may introduce 
seeming exceptions to the principle that responsibility should be linked 
with command. The relationships acquire a contractual character,
especially if a regular flow of data from the servicing agency to its




“customer” agency is desired. In scheduling production the administra- 
tor of the first must resist the frequent temptation to subordinate the 
interests of the second. There is perhaps no greater impediment to 
statistical integration than the fear that if important statistical work 
is contracted out to another organization it will lose priority. 

The attainment of optimum balance in an omnibus program, giving 
appropriate weight to each separate interest and to the general welfare, 
is one of the most delicate problems of statistical coordination.

4. Another difficult problem is that of establishing demarcation
lines between governmental and nongovernmental responsibilities for 
the collection of statistics. The governing principle is clearly that data 
should be collected at the expense of government only when vested 
with public interest. This formula shares the simplicity of the well 
known secret of money making in the stock market—buying low and 
selling high. The difficulties appear in the application of the principle. 
Few statistical series on any subject are without some public im-
portance. Even a strictly private interest, if shared by a sufficient
number of people, becomes of public interest through its implications 
for the economy. Aids to farmers in marketing their crops provide 
examples. To what figure would the number of beneficiaries have to 
be reduced for the “public” interest in statistical estimates of crop 
production to become strictly “private”? Ten thousand? One thousand? 
One hundred? Ten? Or the single producer? 

In historical fact, conceptions of what may or may not be appropriate 
undertakings by government are under constant revision. I have tried 
for many years to find some magic formula by which to apply the
criterion of public interest to governmental statistical activities. I 
conclude that there is no general criterion and that the conception 
must be applied to individual situations as they arise, instance by in-
stance.

5. The extent to which the protection of “confidentiality” should
be thrown around industrial or company data supplied to an agency
of government raises almost equally difficult problems of discrimina-
tion. These evoke the emotions of respondents; they provoke disputa-
tion between exponents of monistic and pluralistic conceptions of gov-
ernment; and they produce headaches for a statistical coordinating
agency.

Perhaps I am guilty of some bias in favor of the pluralistic concep-
tion when I say that once again the governing principle is clear. This
is that data supplied to an agency of government for statistical pur-
poses should not be allowed, through disclosure, to cause individual




hardship or disadvantage. It should not be used to support legal action 
against the respondent in the courts. It should not fall into the hands 
of business competitors who would find therein a competitive advan- 
tage. 

By becoming an unreasoned fetish this wholesome principle has 
actually worked in the past to considerable disadvantage to both 
government and suppliers of statistical information. For many years 
two Federal statistical agencies, each legally empowered to collect
the same information from the same respondents, believed themselves 
unable to share this information with each other and thus avoid the 
necessity of duplicating their inquiries of the public.

This situation was mitigated by certain of the provisions of the
Federal Reports Act of 1942; yet the “fetish” remains as an impediment 
to many practical steps that would otherwise assist in the integration 
of Federal statistical activities. In a recent instance agency A sought 
to avoid the dilemma by asking its business respondents to supply in- 
formation previously given to agency B, in accordance with specifica- 
tions copied by A from those of B. Still more recently it was demon- 
strated that some millions of dollars might be saved in conducting 
the (ill-fated) Census of Business for 1953 by utilizing the information 
reported on income tax returns by retailers having no employes. The 
procedures proposed would have provided full protection against dis- 
closure to competitors or to Federal agencies other than the two 
directly concerned. When the proposal was considered by an advisory
group of business consultants it was initially viewed with a skepticism 
approaching horror. Gradually the motive of economy prevailed and 
its adoption was recommended. 

Encroachments by governments upon the liberties of individuals
pose one of the great politico-ethical problems of our day. Federal 
statisticians and respondents alike are handicapped by the absence 
of a clear understanding and agreement upon the limits of “confi-
dentiality” that should be attached to statistical returns.

6. Analogous and sometimes related questions are raised by the 
needs of government to withhold from its own citizens certain statisti- 
cal data which, if they reached potential enemies, would give aid and 
comfort to the latter. Since these questions are not inherent in the
processes of statistical coordination per se, I shall leave them aside 
with a single footnote reference.³

7. Lastly I mention the need for liaison between Federal statistical 


³ Stuart A. Rice and Joseph W. Kappel, “Strategic Intelligence and the Publication of Statistics,”
The American Political Science Review, XLV (1951).




agencies and the statistical profession. How can we in the Federal
Government best consult with our nongovernmental colleagues? How
can we obtain their advice upon the issues we face, advice which is at
the same time technically competent and fully informed respecting
the setting in which the issues arise? The last condition is essential if
advice given us is to be realistic. If the condition is met, the prepara-
tions for advice-giving must of necessity be very time-consuming,
both for advisors and advisees.

In the lucid and “sobering” analysis of the responsibilities that have
been placed upon Federal statistics in connection with the nation’s
practical affairs, presented in her notable Presidential address at
Chicago in December, 1952, Mrs. Wickens⁴ grappled courageously
with this thorny problem. She felt that the time had come “for the
profession as a whole to share some responsibility for these statistics
with those who make them.” She therefore proposed “that there be
created a new United States Statistical Commission, with responsibil-
ity for audit of statistical series, similar to an accounting audit, em-
powered to put a ‘certified’ label on a statistical product. It should
also be charged with investigation of methods, scope, and suitability
of statistics, and with making recommendations for future improve-
ments and developmental work. . . . Primarily, its membership would
be drawn from experts outside government. . . . It should be a con-
tinuing body, serving on occasion as required, but with a small full-
time staff, and adequate financing, so that our most distinguished
statisticians, economists, scientists, and other specialists could reason-
ably be expected to devote time and attention to its work. . . .”

Mrs. Wickens’ proposal excited admiration for its boldness and
breadth. It remains my opinion that the objectives which she visualized
can be realized in the present only piecemeal and on a much more
modest scale. How would the Commission, as a “continuing body,” be
financed? If through Federal appropriations, how long could it escape
the constricting influences that surround other Federal agencies? How
would it divide its time between its technical functions and the annual
and prolonged struggles over appropriations, over personnel appoint-
ments and security clearances? Presumably such an organization would
have to operate under the full and specific authorities possessed by the
Office of Statistical Standards; but how would it become related to
such other existing Federal agencies with central functions and legal
authorities as the Council of Economic Advisers?

⁴ Aryness Joy Wickens, “Statistics and the Public Interest,” Journal of the American Statistical
Association, 48 (1953), pp. 1-14.




Most difficult of all, where would it find the distinguished members 
who could devote the requisite time to the labors that are visualized? 
Whatever the competence of the individuals who compose them, high 
level committees and commissions need continuity of membership and 
long exposure to the problems upon which they are asked to advise.
The inevitably complex problems of government statistical organiza- 
tion cannot be quickly grasped. The Commission would demand mem- 
bers having a high degree of competence and general experience, but 
such people are invariably very busy. 

I can vigorously applaud Mrs. Wickens’ references to the “first
steps” made by our profession “in these directions.” I feel great satis- 
faction in the services given to the Office of Statistical Standards by the 
Advisory Committee on Statistical Policy, created by the American 
Statistical Association at our request in 1951 and composed of mem- 
bers appointed from a distinguished panel of the Association’s present 
and past presidents and president-elect. Methodically, and without 
undue haste, the Committee has thought its way through the in- 
tricacies of a number of perplexing policy issues regarding which the 
Federal statistical services are entitled to look for leadership to the 
Office of Statistical Standards. These have involved some of the prob- 
lems that I have discussed above, including that pertaining to the 
confidentiality of individual returns and differentiation between the 
appropriate areas of governmental and nongovernmental agencies in 
the collection of data. 

The Committee is slowly approaching some of the other tasks which
Mrs. Wickens proposed for the new Statistical Commission, such as 
“making recommendations for future improvements and develop- 
mental work.” I cannot foresee the arrival of a time in its work when it 
could itself undertake “responsibility for audit of statistical series” 
or detailed investigations of “methods, scope and suitability” of Federal 
statistics. 

We should like to make it possible, through suitable compensation 
for the services it renders, for the Advisory Committee on Statistical 
Policy to render even greater service in the future than it has in the 
past. We do believe that it should avoid becoming entangled in small 
issues and that it should be able to sift out for its attention policy
questions of the highest priority; for we do not want a “captive com- 
mittee” that can be presented at regular intervals with a long list of 
activities for its approval. 


Other types of advice are needed and received by the Office of
Statistical Standards from other sources. Intra-governmental com-




mittees and conferences of representatives of Federal statistical 
agencies are meeting daily upon specific questions of coordination. The 
Advisory Council on Federal Reports, representing the world of in- 
dustry and business, operates with its own budget and secretariat, 
bringing to bear upon us the viewpoints and interests of respondents 
upon the procedural problems which arise in the Federal collection of 
data. Similarly, the Labor Advisory Committee on Statistics presents 
to us the needs for Government figures of an important segment of 
the statistics-consuming public. 

Mrs. Wickens closed her memorable address with the expression of 
belief that “As a profession, statisticians must organize to meet this 
challenge if statistics are to continue to be administered in the public 
interest.” Despite many set-backs and discouragements, and in ways 
less pretentious than by the specific device she proposed, I believe they 
are doing so. 


GROWTH BY MERGER* 


G. WARREN NUTTER
Yale University 


LIKE the business cycle in the recent past, industrial organization is
a topic more talked about than investigated. Every new empirical
study will be devoured by the fact-hungry specialist in this field, in the
hope that it will answer some of the long list of unsettled questions.
Professor Weston’s recent book on mergers will surely be avidly read, 
but it will not relieve the hunger as much as one would hope. 

The book deals with many things, although about half of it is con- 
cerned with the main topic, namely, the role mergers have played in 
the absolute and relative growth of large firms. The remaining half dis- 
cusses the particulars of the most recent merger movement, the theory 
of mergers, and problems of economic policy. This review will concen- 
trate on the main topic: it is of greater interest to the general economist 
than the others and it lies less completely outside my areas of com- 
petence. The remainder of the book is certainly worthy of attention; 
it is slighted here because I find it less controversial and should have 
little to contribute on it in any event, in view of my bare nodding 
acquaintance with the history of mergers. The chapter on mergers of 
the 1940’s is a useful and interesting summary of the more reliable 
studies of this movement. The chapter on the theory of mergers, though 
it contains some doubtful conclusions on the motives behind them, 
sheds new light on factors explaining the timing of merger movements. 
The chapter on economic policy adds few original suggestions but sum- 
marizes the problems rather well. I leave it to experts on the history of 
mergers to examine the details of these chapters. 


THE CENTRAL ISSUE 


As I read the history of economic controversy, the central issue about
mergers is whether industrial concentration and corporate giantism are
in any significant degree traceable to them. There are two other
subsidiary issues: whether concentration and bigness are the usual results
of mergers; and whether they are the primary goals. These are all
important issues, but they must be recognized as distinct. Though the
point will not be developed further, it can be noted that Weston does
not always avoid mixing them up.¹

* J. Fred Weston, The Role of Mergers in the Growth of Large Firms (Berkeley
and Los Angeles: University of California Press, 1953). Pp. xvi, 159. $3.50.
1 See particularly the discussion in his book on pp. 34-37 and 51-57.




The first issue is of importance because an answer to it may cast light 
on the bases of monopoly (or, to use a more widely accepted word,
oligopoly) and big business. Theory tells us that a firm may achieve a 
dominant position in an industry for any of three broad reasons: (a) it 
may have genuine economies of scale relative to the market it operates 
in; (b) it may have “artificial” economies, such as patents; or (c) it may 
be able to fare better than average over the long haul through tempo- 
rary exploitations of its dominant position. It is virtually impossible to 
discover the comparative importance of these factors by direct search- 
ing for causes; we cannot, for instance, measure economies of scale. 
Hence we must rely on inferences from other, more indirect, evidence. 
One relevant matter is the way in which firms have been put together. 
It is at this point that the question of mergers enters. If a dominant 
firm has chosen to achieve a significant part of its growth through 
mergers, doubt is cast on the importance of economies of scale, cer- 
tainly as they might derive from plant operations. The greater the ex- 
tent to which dominance has been maintained through mergers, the less 
likely that economies of scale are a general basis of monopoly (or oligopoly),
particularly if the typical picture is one of a continual struggle to
retain dominance in the face of continual encroachment by other firms. 

The link between analysis of mergers and bigness is similar. Without, 
of course, exhausting all possible reasons for corporate giants, one may 
suppose that the most important are advantages of monopolistic posi- 
tion and economies of producing multiple products. Here again the 
method and nature of growth helps us to discover reasons, if only by 
process of elimination.

It is hard to know if Weston shares this view of underlying issues. 
In the opening sentences of his book he does suggest that interest in 
mergers arises from concern over broader questions, but he does not get 
down to details. He formulates the empirical problem in very general
terms, apparently in order to make the findings applicable to a wide
range of specific issues. “The sources, extent, and consequences of big- 
ness and concentration have been widely disputed,” he says, and “many 
questions remain unsettled.” He goes on to list some of them and con- 
cludes that, “although the formation of appropriate public policies 
awaits further study of these and related issues, our understanding of
the nature and appropriate role of large firms may be increased by more
complete information concerning the process of their growth” [7, p. 1].

There is little to dispute here, if we substitute “relevant” for “complete,”
find out what is relevant, and frame the empirical problem accordingly.
Instead of using this approach, however, Weston looks directly
to economic literature and concludes that “mergers are often
cited as the major source of economic concentration” [7, p. 2]. Hence he 
sets this up as the empirical question to be resolved. 

Some economists have undoubtedly made statements like this,² but
it is doubtful that such statements reflect the heart of the controversy 
over the role of mergers. In any event, Weston’s framing of the empiri- 
cal question is unfortunate; for, as his later analysis reveals, he is trying 
to find out whether mergers account for more or less than half of the 
growth of “oligopolistic” firms, a matter of limited relevance to the 
underlying issues as I see them. The real question is whether the pattern 
of concentration and bigness as we now observe it would have emerged 
in the absence of mergers; that is, whether mergers have accounted for 
a significant portion of growth. There would seem to be no reason for
choosing the fraction one-half as the dividing line between significance
and insignificance. Yet he does so, explicitly as well as implicitly. For
example, he characterizes mergers as a negligible factor in growth when 
he finds that they have accounted on the average for a fourth to a third 
of the absolute growth of firms studied [7, p. 30]. 

It would do Weston’s work an injustice to say that his data bear in
no way on central issues; on the contrary, there is much here with direct
bearing, available for the first time. I lament that there is not more, for 
there could have been with a slight change of orientation. Weston has 
collected certain types of data. He has used only a portion of these, 
summarized in ways best suited to his objectives. He has not given 
much attention to the problem of ordering the basic data in a flexible 
manner, a defect made more acute by failure to publish many of them. 
All of which is to say that it is very difficult to make much additional 
analysis of the data presented. On the other hand, something can be 
said about Weston’s findings, which is the task we turn to next. 


ABSOLUTE GROWTH OF FIRMS


Weston selects 22 census industries with highly concentrated output
in 1935 and studies the growth of 74 of their dominant firms,³ beginning
in the earliest year for which data could be found in each case and ending
in 1948. Growth is measured by accretion in value of total assets.⁴


same as Weston’s statement.

3 He explains the basis for selecting industries [7, pp. 112-21] but not firms, other than to say that
they are dominant in the selected industries. Important firms are omitted from several industries,
[...] ammunition, ink, sewing machines, and cement. One is led to suppose that
ease in obtaining data played a part in selection of the sample.

Growth by merger, or “external” growth, is defined as the accretion 
resulting directly from all types of acquisitions of already existing 
firms; the residual is viewed as “internal” growth, that is, as growth by 
means other than merger. The relative importance of external growth 
is measured by the fraction of total growth accounted for by mergers, 
which we shall call “proportional growth by merger.” Because of conceptual
and empirical difficulties in tracing external growth to its beginnings,
Weston develops three measures, differing from each other
in their treatments of assets in the earliest year for which he could find 
data. In the first measure the initial assets are regarded as external 
growth, that is, as resulting entirely from mergers; in the second they 
are excluded from both external and total growth, only subsequent 
growth being measured; in the third they are excluded from both ex- 
ternal and internal growth but not from total growth—that is, they are 
considered a separate component of growth. The first measure gives the 
highest estimate of proportional growth by merger, the third gives the 
lowest, and the second gives an intermediate estimate. 
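
The three measures can be written out explicitly. The following is a minimal sketch, not Weston's own computation: the function name, the split of 1948 assets into initial assets, later acquisitions, and a residual, and the sample figures are all assumptions made for illustration.

```python
def merger_shares(initial, external, internal):
    """Return the three proportional-growth-by-merger measures.

    initial  -- assets in the earliest year with data (hypothetical input)
    external -- later accretion from acquisitions
    internal -- residual accretion by means other than merger
    """
    total = initial + external + internal
    # First measure: initial assets treated as growth by merger.
    first = (initial + external) / total
    # Second measure: initial assets dropped from both external and total growth.
    second = external / (external + internal)
    # Third measure: initial assets kept in total growth as a separate component.
    third = external / total
    return first, second, third


# Hypothetical firm: 10 of initial assets, 20 acquired, 70 grown internally.
first, second, third = merger_shares(10.0, 20.0, 70.0)
print(round(first, 3), round(second, 3), round(third, 3))
```

With any positive inputs the first measure is at least as large as the second, and the second at least as large as the third, which matches the ordering described in the text.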

Weston shows no preference for one measure over the other, and one 
must admit that it is difficult to choose among them. The first would
be clearly the best if the initial assets of the largest firm taking part in 
the earliest merger were not counted as external growth, though there 
are conceptual problems here, too. As the measures stand, one can say 
that the second and third, taken by themselves, are more misleading 
than the first, because their understatement magnifies a strong down- 
ward bias that is present for another reason, discussed below. It is my 
guess that this bias overwhelms all others; therefore, at least for judging 
the over-all importance of mergers, the first measure would seem to be 
least bad, on the grounds that it probably minimizes a general downward
bias.

Like all empirical workers, Weston was faced with a host of measurement
problems, none having a wholly satisfactory solution. He discusses
most of these quite adequately. For instance, he devotes an appendix
to the question whether total assets are a better index of size
than any component of assets. He divides the basic data on firms into
three groups, on the basis of the reliability that he attributes to them.


4 Growth of firms in the steel industry is also sometimes measured by growth of ingot steel
capacity [7, pp. 22-23 and 132-34]. The reason for this measure, and for selected use of it (not always
called to one’s attention), is not explained.

5 Weston does something similar to this in an isolated appendix table [7, p. ...], but it plays
no part in his analysis. He does not explain why the procedure was followed here or why it was not
adopted as a general practice.


According to his judgments, data are “accurate” (were fully confirmed 
by companies) for firms accounting for about a third of aggregate as- 
sets in 1948; “dependable” (partly confirmed) for firms accounting for 
about a half; and “questionable” (unconfirmed) for firms accounting 
for about a sixth. In general, he segregates findings for each group so 
that some allowance may be made for the relative shortcomings of 
basic information. Finally, he points out that his measures of external 
growth cover only that part of growth directly attributable to mergers; 
they do not take account of indirect effects. He argues that any effort 
to summarize both direct and indirect effects would lead to unanswer- 
able questions about how growth would have gone in the absence of 
mergers. In some cases it might have gone worse, in others better. This 
view seems reasonable enough in terms of strict logic, and examples can 
be found of both effects. Nevertheless, one wonders whether there is not 
in general a presumptive advantage in mergers, at least as far as rela- 
tive growth is concerned. Otherwise, why do firms continue to engage 
in them?⁶

In view of the attention paid to some of these problems, it is strange 
to find no mention of one that probably overweighs all others: how to 
deflate the value of assets to allow for changing price levels. Even if we 
waive all other troublesome problems connected with measuring growth 
of capital (which is in itself hardly permissible), we are not justified in 
taking a dollar's worth of assets as representing the same real value in, 
say, both 1900 and 1948; some adjustment must be made for differences
in price levels. By Weston’s procedures, total growth is represented
by the value of assets recorded in 1948.⁷ Thus it is measured in
terms of recent prices. External growth, on the other hand, is repre- 
sented by assets valued at the times mergers occurred. Since there has 
been a secular rise in relevant price levels, proportional growth by 
merger has been understated in this respect. Moreover, the understatement
is probably of considerable magnitude; for, in spite of the understatement,
Weston’s data on all firms as a group show that a large share
of growth by merger occurred in early years: 8 per cent before 1911,
21 per cent before 1921, 71 per cent before 1931, and 89 per cent before 
1941 [7, pp. 155-56]. Unfortunately, the deflation problem is formidable.⁸
There is simply no easy way to determine the degree of understatement,
particularly on the basis of the data presented. The best we
can do is to keep this qualification in mind when examining Weston’s
findings.

6 Weston suggests other reasons for mergers, applying primarily to recent ones [7, pp. 70-75].
But they seem to boil down to a list of [...] in which mergers have an advantage over internal expansion.
7 This is reduced in the second [...] measures by the value of initial assets.
8 [...] have been partly avoided, or at least some notion of the likely bias [...]
have been gotten, [...] growth in terms of [...]
with no knowledge of the empirical difficulties [...]
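
The direction of the deflation bias can be illustrated with a rough sketch. The period shares below are derived from the cumulative percentages quoted above; the price levels are purely hypothetical placeholders, and the calculation makes the simplifying assumption that acquired assets can be revalued at 1948 prices while the 1948 total is left unchanged. It is an illustration of the argument, not an estimate.

```python
# Share of cumulative merger growth falling in each period, taken from the
# cumulative percentages in the text (8, 21, 71, and 89 per cent):
# before 1911, 1911-20, 1921-30, 1931-40, 1941-48.
period_shares = [0.08, 0.13, 0.50, 0.18, 0.11]

# HYPOTHETICAL price levels for each period relative to 1948 = 1.0.
# These are placeholders for illustration only, not estimates.
price_levels = [0.35, 0.50, 0.60, 0.50, 0.80]

# Third measure reported for all firms as a group (external / total).
nominal_share = 0.19

# Revalue each period's acquisitions at 1948 prices.
inflator = sum(s / p for s, p in zip(period_shares, price_levels))
adjusted_share = nominal_share * inflator
print(round(inflator, 2), round(adjusted_share, 2))
```

Whatever price series is substituted, any secular rise in prices makes the inflator exceed one, so the adjusted share lies above the nominal one; only the size of the correction depends on the assumed index.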

For the firms as a group, external growth accounted for 33, 22, and
19 per cent of total growth, under the three measures used. The highest
fraction occurs of course when initial assets are counted as external
growth. For each measure the fraction is significantly higher for firms
with “questionable” data than for other firms.⁹ It is difficult to know
how to interpret this, however, since the direction of error in the former 
is unknown. There appears to be no general relation between propor- 
tional growth by merger and size of firm; unweighted and weighted 
means are close together. At the same time, there is a wide dispersion 
among firms. For instance, when initial assets are counted as growth by 
merger, proportional growth by merger ranges from less than 8 per cent 
(Reynolds Tobacco) to almost 85 per cent (B. F. Goodrich). The me- 
dian is about 30 per cent. 
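
As a check on the aggregate figures, the three reported fractions can be tested for mutual consistency. The sketch below assumes that the measures are defined as described earlier in this section and that rounding is the only source of discrepancy; the variable names are illustrative.

```python
# Aggregate shares reported in the text, as fractions of 1948 total assets.
first, second, third = 0.33, 0.22, 0.19

# The first measure counts initial assets as merger growth and the third
# does not, so the gap between them is the implied initial-asset share.
initial_share = first - third

# The second measure drops initial assets from numerator and denominator.
implied_second = third / (1.0 - initial_share)
print(round(initial_share, 2), round(implied_second, 3))
```

On these assumptions the implied second measure agrees with the reported 22 per cent to within rounding, which suggests the three aggregate figures are internally consistent.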

The same dispersion is apparent among industries. When the 74 
firms are squeezed into 22 industrial categories, proportional growth by 
merger, expressed as the weighted mean for firms in each industry, 
ranges from 8 per cent for aluminum to almost 70 per cent for cement, 
ammunition, and steel.¹⁰ The median is about 36 per cent. The data by
industry are summarized in the table below. It might be noted that
Weston does not give the weighted means in his tabular presentation
[7, p. 22], and in some cases these differ rather markedly from unweighted
means (for instance, asphalted-felt floor coverings, photographic
apparatus, and dairy products). He also uses a figure for steel
derived from growth of ingot steel capacity, a figure that is much lower
than the one derived from growth of total assets.
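
The difference between the two kinds of mean is easy to see in a small case. The sketch below uses a hypothetical two-firm industry; the dollar figures are invented for illustration and do not come from Weston's data.

```python
# Hypothetical two-firm industry: (growth by merger, total growth).
firms = [(80.0, 100.0), (100.0, 1000.0)]

# Unweighted mean: average of each firm's own percentage.
shares = [merger / total for merger, total in firms]
unweighted = sum(shares) / len(shares)

# Weighted mean: aggregate growth by merger over aggregate total growth.
weighted = sum(m for m, _ in firms) / sum(t for _, t in firms)
print(round(unweighted, 3), round(weighted, 3))
```

When a small firm has grown mostly by merger and a large one mostly internally, the unweighted mean is pulled well above the weighted mean; the reverse pattern pulls it below.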

It must be repeated that all figures are reduced if Weston’s other two 
measures are used, some quite drastically. For instance, if initial assets 
are excluded from consideration, ammunition tumbles from the top of 
the list of industries (70 per cent) down to the bottom (2 per cent). 

PROPORTIONAL GROWTH BY MERGER IN 22 INDUSTRIESᵃ

                                   Unweighted   Weighted
Industry                             Meanᵇ        Meanᶜ       Range
                                      (%)          (%)         (%)

Ammunitionᵈ                           70           70           -
Cementᵉ                               69           69         67-72
Steel                                 64ᶠ          67         47-77
Compressed and liquefied gases        52           52         51-52
Asphalted-felt floor coverings        51           42         34-68
Typewriters and parts                 51           54          8-74
Photographic apparatusᵉ               49           28         12-85
Dairy products                        48           62         14-82
Corn syrup, sugar, oil                44           49         15-66
Tin cansᵉ                             41           39         34-47
Liquors                               40           39         27-59
Rubber tires                          38           33          5-85
Meat packing                          30           33         11-60
Petroleum refining                    28           27         11-44
Cigarettes                            25           21          3-58
Inkᵉ                                  22           23         18-26
Rayon and allied products             22           31         10-39
Electrical machineryᵉ                 21           20         17-25
Motor vehicles                        21           18          6-35
Sewing machinesᵈ                      17           17           -
Agricultural implements               14           19          8-25
Aluminum                              10            8          7-14

a Initial assets counted as growth by merger.
b Mean of percentages for firms in each industry. Source: [7, p. 22].
c For all firms in each industry as a group, aggregate growth by merger as a percentage of total growth.
d Only one firm covered.
e Only two firms covered.
f Weston gives 53 per cent, a figure based on growth of ingot steel capacity expressed in tons.


9 Weston considers the differences as “not great.” Whether great or not, they must be taken as
significant, running as follows: 41 per cent as compared with 36 per cent (for firms with “dependable”
data) and 26 per cent (for firms with “accurate” data); 32 as compared with 23 and 18; and 27 as
compared with 19 and 16.
10 The figures for cement and ammunition are probably not very meaningful. Only two cement
companies are covered, one (Lone Star) accounting for 7 per cent of output in 1945, the other (Ideal)
3 per cent [7, p. 40]. Only one ammunition company (Remington Arms) is covered.
11 But see n. 10 above.


What are we to conclude from this evidence? Let us see first what
Weston concludes. “It appears,” he says immediately following the
presentation of evidence, “that as a group, and irrespective of measurement
assumptions, the firms studied achieved the major extent of their
growth through internal development. The proportion of growth
through external acquisitions, however, is appreciable” [7, p. 15]. This
at one point; but later he says something quite different. In his summing
up at the end of the chapter on absolute growth he states:
“Acquisitions have been a negligible portion of the total growth of most of
the firms in census industries now characterized by a high degree of
concentration in output. . . . The direct effect of mergers on the absolute
size of large firms appears to have been small” [7, p. 30]. And in his
final summing up: “The extent to which individual firms have grown
by acquisitions varies greatly, but external growth is a relatively minor
fraction of the total growth of most of the firms” [7, p. 101]. There is a
rather wide gulf between “appreciable,” on the one hand, and “negligible,”
“small,” and “relatively minor,” on the other. There is little doubt
from the general tone of his argument and widely scattered references
that Weston really considers mergers a negligible source of growth. Yet
his temporary wavering in the other direction is significant; in fact, it is
the key for understanding why Weston adopts his final conclusion.

The point is simply this: As was said earlier, the question Weston 
sets out to examine is whether mergers have been “the major source of 
economic concentration.” If “the major source” is understood to mean 
“responsible for substantially more than half,” fractions below a half 
become minor, or small or negligible. It is now clear that Weston does
conceive of the issue in these terms. Otherwise, why his conclusions? 
It is difficult to see any other grounds for calling a fraction of 33 per 
cent (or 22 or 19 per cent) negligible. 

The impressive thing, to me at least, is the height of the fractions not 
their depth—particularly when account is taken of their probable 
downward bias and of the vast growth of the economy over the last half 
century, which leads us to expect internal growth to swamp growth by 
merger. Growth by merger has certainly been important enough, in a
sizable group of the industries studied, to cast serious doubt on some of
the explanations advanced for industrial concentration and abnormally
large size.

Whatever conclusions are drawn, they should be considered tentative.
Although this study makes a useful contribution, much remains
to be done. It would be useful, for instance, to know the role played by
mergers in industries that were highly concentrated around the turn of
the century but are no longer; and in industries with continually low
concentration. It would also be useful to study the changing importance
of mergers over time. Weston unfortunately provides little information
on this matter; the data presented by him are limited to broad sweeps
of time ending in 1948, not broken down into subintervals. Finally,
learning about the role of mergers in absolute growth only starts us on
the way to learning about the role in relative growth. This leads us into
the next topic discussed by Weston.


RELATIVE GROWTH OF FIRMS


Weston opens the discussion of relative growth by characterizing the
trend in industrial concentration as a movement from partial monopoly
around the turn of the century to oligopoly in recent times. The period
of partial monopoly (dominance of an industry by a single firm) is in
turn described as a temporary diversion from a historically typical pattern
of oligopoly (multiple dominance), as far as concentrated industries
are concerned. That diversion he attributes almost entirely to the
merger movement around the turn of the century. That is, growth by
merger is in his view the primary explanation for increased concentration
in specific industries in the 1890’s and early 1900’s. Later developments,
he asserts, are a different matter: “acquisitions subsequent to
the early merger movement have had relatively small effects on
concentration” [7, p. 48].

Before we proceed to the core of his argument, a digressive comment
on the early merger movement should be made. Weston’s description
of the result as a general trend from multiple to single dominance of
industries implies that market areas remained essentially fixed during
the latter part of the nineteenth century. This was, however, a period in
which the truly national market was emerging; localized markets were
much more prevalent in the years leading up to the mergers than
thereafter. It is therefore not unreasonable to view mergers as a force
counteracting expansion of markets, and hence as leading to less change in
the structure of dominance than is usually believed. This point is raised
not to refute Weston’s argument, but rather to shift the line of argument
in a rather obvious way. The point is this: The early merger
movement is important in the history of industrial concentration because
it made concentration, taken in a relevant sense, greater than it
otherwise would have been, irrespective of whether it actually
increased concentration or not.

Now, when we come to developments after 1904, the primary issue
must be put in a similar way: Have mergers played an important role
in making the pattern of concentration significantly different from what
it otherwise would have been? Weston raises this question, but only
after he is far along in his discussion; and then it is put as more or less
subordinate to two other, narrower, issues. First, he wants to know
whether the trend from single to multiple dominance can be attributed
to a change in the nature and motivation of mergers. This question is
essentially part of a running argument with Professor Stigler, on which
comment will be deferred until later. Secondly, he wants to know
whether mergers since 1904 have been accompanied by increased
concentration. He concludes that they have not. It is in qualification of
this conclusion that he raises the central issue, namely, “decreases in
concentration might have been even larger in the absence of mergers”
[7, p. ...].


In order to understand Weston’s analysis of the central issue, we are 
almost forced to follow his rather wandering path to it. As he sets out 
to trace the relation between mergers and trends in industrial concen- 
tration, he is immediately confronted with the frustrating lack of re- 
liable historical data on industrial concentration. Every one who has 
struggled with this problem will feel sympathy for Weston. Concentration
ratios can be compiled from the Census of Manufactures from
1929 to date; many already have been. However, ratios for individual
firms are not available even here, since the data are grouped for no 
fewer than the four largest firms in each industry. Moreover, ratios for 
different years are not strictly comparable because of different defini- 
tions of industries. A fairly large fund of information can be gathered 
for the turn of the century, much of it broken down by individual 
firms; but its reliability is doubtful, to say the least. The investigator
is left to his own devices in constructing time series; he must search 
trade journals, isolated monographs, and so on. The difficulties can 
scarcely be exaggerated. 

Weston pulls together estimates of each leading firm’s share of output
in 9 of his original 22 industries, covering selected years over the
last half century. The estimates for 5 industries (motor vehicles, steel,
cigarettes, aluminum, and cement) are as reliable as one can expect,
though they could be more complete. On the other hand, the estimates
for the remaining 4 industries (electrical machinery, meat packing,
rubber tires, and tin cans) are built on a very shaky foundation, and it
is doubtful that much significance can be attached to them. In these latter
cases, each firm’s share of output is taken to be the same as the ratio
of its sales of all products to the value of products for the census
industry in which that firm can be classified. The size of probable
errors under this procedure is so large as to destroy the significance of
all but very large differences in shares at different dates.¹³ It must be
granted that census value of products for an industry, when computed
on an establishment basis, includes the value of some products not
classified in the industry; but it would be highly unlikely that errors
here and in sales data for firms would be compensating, among firms
and between firm and industry. In a footnote to his table Weston recognizes
that the data “are not strictly comparable,” but he believes

12 See, e.g., [2, pp. 129-40].
13 Some indication of possible errors is provided by comparing the combined share of the four
leading firms for around 1935 as derived from Weston’s estimates, in some cases by interpolation, with
the combined share for 1935 computed from census data. For meat packing, the two are 30 and 50 per
cent, respectively; for rubber tires, 73 and 81 per cent; for tin cans, 87 (two firms) and 80 (four firms)
per cent; and for electrical machinery, 37 (two firms) and 44 (four firms) per cent [7, pp. 40-41 ...].


that “the percentages through time indicate very roughly trends in 
occupancy of the market" [7, p. 40]. Even this limited conclusion is 
open to serious doubt. Moreover, in his analysis he treats changes in 
shares of output, measured in this way, as having much more accuracy 
than would be possessed by rough indicators of trends. 

Weston's first step is to examine developments in each of these 9 
industries. Let us focus on those for which data on concentration can
be considered reliable. In the case of motor vehicles, he notes a rather 
steady secular rise, with temporary ups and downs, in the dominance 
of General Motors, being achieved since 1921 largely at the expense of 
Ford's share of the market. The gains of General Motors are, he says, 
not to be attributed to mergers because “the emergence of General 
Motors Corp. as a leader in the industry came many years after the 
consolidating operations from 1911 to 1920 under W. C. Durant" [7, 
p. 36]. This is a strange conclusion in several respects. First, Durant 
was in and out of control over General Motors in the period from 1908 
through 1920; from 1910 to 1915, while out, he built up the Chevrolet 
Motor Company into a threatening rival [6, pp. 419—429]. After the 
two were merged in 1915, through financial manipulations by Durant, 
General Motors’ share of the market rose substantially [6, p. 27]. Sec- 
ond, the spectacular rise of General Motors occurred in the immedi- 
ately following decade of the twenties, after the serious financial prob- 
lems inherited from Durant's regime had been solved [6, p. 27]. This 
development cannot be read from the information Weston presents,
because he does not give data for the period between 1921 and 1937.
Third, merger activity was by no means stopped after 1920 [6, pp.
428-29].

Another important development hidden by Weston's presentation of 
data is the rapid rise of Chrysler in the late twenties and throughout 
the thirties, a rise attributable in large measure to mergers. Chrysler's
share of the market rose steadily (except for around 1935) from 3 per 
cent in 1925 to 23 per cent in 1937 [6, p. 27]. 

For the steel industry he notes a steady decline in the combined 
share of the four leading producers from 1901 through 1920, an increase
through 1930, and a slight decline thereafter. The rise in the
twenties he attributes to growth by merger. He says that “since 1930,
however, despite continued merger activities, market occupancy of the 
largest four has decreased slightly” [7, p. 38]. This statement is misleading
for two reasons. First, the decline in the combined share of the
four (or five) largest firms is barely perceptible, running at 0.2 of a
percentage point, well within the range of computational error alone.
Second, the shares of Republic and Bethlehem, taken individually,
increased by a significant amount; and they were the firms with greatest
proportional growth by merger over this period. Their combined
increase was matched by the decline for U. S. Steel, which had virtually
no merger activity in this period. 

He describes the trend in the cigarette industry as similar to that 
in steel. In this case, however, the pattern seems to be one of a general 
secular decline in the share of each of the four leaders through 1939, 
with the rise of Philip Morris in the thirties complicating the picture. 
If the data are carried through 1949 (which is not done by Weston), the
pattern is changed to the extent that the largest producer, American 
Tobacco, regains most of the share it lost between 1912 and 1939 [5, 
p. 94]. 

The two remaining industries with reliable data on concentration are 
aluminum and cement. Weston rightly describes aluminum as a special 
case of decreasing concentration resulting from disposal of wartime- 
created capacity. We are given few data on concentration in the cement 
industry, covering only the span from 1929 through 1945. We are given 
even fewer data on absolute growth by merger.14 Hence no conclusions
can be drawn about this industry, and Weston does not draw any. 

According to Weston, the data for these industries, and the other 
four not discussed here, “suggest that concentration in industries has 
not generally been increased by mergers since 1904. In the majority of 
industries for which information is available, decreases in concentration 
have actually occurred since 1904 despite merger activity” [7, p. 42]. 
In a sense Weston is certainly right: the share of the largest firm in 
most industries has declined, and the combined share of the two, three, 
or four largest firms has also. But this is not really relevant; the
important question is whether the declines occurred for firms with high
or low proportional growth by merger. The following table shows a
rather consistent relation between declines and relatively low
proportional growth by merger, and between increases and relatively high
proportional growth by merger. This conclusion is not vitiated if industries
with questionable data are included. For every industry except rubber
tires and possibly meat packing, relatively low proportional growth by
merger is associated with declines in share of the market. For every
industry except possibly meat packing and cigarettes (one firm only),
relatively high proportional growth by merger is associated with
increases in share of the market.

Weston’s conclusion, namely, that concentration has decreased

14 See n. 10 above.


. 460 AMERICAN STATISTICAL ASSOCIATION JOURNAL, SEPTEMBER 1954 


RELATION OF MERGERS TO CHANGES IN INDUSTRIAL
CONCENTRATION, 1904-1948

                          Proportional      Change in      Share of
Industry and Firm         Growth by         Share of       Output,
                          Merger(a) (%)     Output(b)      1948(b) (%)

Motor vehicles
  Chrysler                35                +              22
  General Motors          20                +              44
  Ford                    5                 -              20
Steel
  Republic                77                +              9
  Jones and Laughlin      56                ?              5
  Bethlehem               47                +              15
  U. S. Steel             10 (75)           -              33
Cigarettes(c)
  American Tobacco        13 (34)           -              31
  Philip Morris           12                +              9
  Lorillard               6 (58)            -              5
  Reynolds                3                 -              26
  Liggett and Myers       0.1 (16)          -              20
Aluminum
  Reynolds                14                +              33
  Permanente              9                 +              17
  Alcoa                   [illegible] (7)   -              50
Electrical machinery
  Westinghouse            25                +?             15?
  General Electric        14                -?             25?
Meat packing(?)
  Armour                  60                0?             12?
  Wilson                  35                0?             4?
  Swift                   13                [illegible]    13?
  Cudahy                  [illegible]       0?             3?
Rubber tires(e)
  Goodrich                85                +?             14?
  U. S. Rubber            47 (55)           0?             21?
  Goodyear                6                 +?             25?
  Firestone               5                 +?             24?
Tin cans(?)
  Continental Can         47                +?             25?
  American Can            4 (34)            -?             51?

a [illegible] shown in parentheses. Source: [7, appendix E].
b Source: [7, pp. 20-41]; for the cigarette industry, also [5, p. 94].
c Terminal date is 1949.
d Terminal date is 1939.
e Terminal date is 1947.




since 1904 despite mergers, leads him to a second line of inquiry. He
raises the possibility that "decreases in concentration might have been 
even larger in the absence of mergers" [7, p. 44]. He embarks at this 
point on a statistical analysis that I am not sure I fully understand. 
My explanation must be given with the warning that it may not be an 
accurate representation of what Weston is trying to do. 

We may get at his approach by considering how information on pro- 
portional growth by merger might be married to information on output 
concentration. Let us suppose that, in the absence of mergers, the 
growth of a firm would have been smaller by exactly the amount of 
assets acquired by mergers. Let us further suppose that a firm's output 
grows in the same percentage as its assets; that is, a doubling of assets 
leads to a doubling of output. Finally, let us suppose that the firms 
acquired by merger had retained their separate identities and had not 
grown. These are heroic assumptions, subject to all kinds of qualifica- 
tion; but they can perhaps serve as working hypotheses for deriving a 
first approximation, which will almost surely be an underestimate, of 
the amount of concentration attributable to mergers. If they are ac- 
cepted on this basis, it follows that the fraction of a firm’s share of 
output attributable to mergers is measured by the fraction of its 
growth directly accounted for by mergers. For instance, by this
reasoning 77 per cent of Republic’s share of steel output would be attributable
to mergers, since 77 per cent of its growth is directly accounted for by
mergers; instead of producing 9 per cent of steel output in 1948, it
would have produced only 2 per cent if none of the mergers had taken
place.
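
The arithmetic of this illustration can be set out in a short Python sketch. The function name `share_without_mergers` is hypothetical, and the calculation merely applies the three working assumptions stated above; it is not Weston's own procedure.

```python
def share_without_mergers(share_pct: float, merger_growth_pct: float) -> float:
    """Share of output a firm would hold had none of its mergers occurred.

    Assumes, as in the text, that the fraction of a firm's share attributable
    to mergers equals the fraction of its growth accounted for by mergers.
    """
    return share_pct * (1.0 - merger_growth_pct / 100.0)


# Republic: 9 per cent of steel output in 1948, 77 per cent growth by merger.
republic = share_without_mergers(9.0, 77.0)
print(round(republic, 2))  # roughly 2 per cent, the figure given in the text
```

The same one-line adjustment could be applied to any firm for which both a share of output and a proportional growth by merger are available, subject to the same assumptions.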

In principle the effect of mergers over any desired period could be 
estimated in this way, by measuring proportional growth by merger 
over that period alone. We cannot do this with the statistics worked up 
by Weston, however; for his measures of proportional growth by 
merger differ only in their treatments of assets in the initial years for 
which he found data, years that vary widely among firms [7, p. 11]. By 
extensive reworking of his basic data we might be able to eliminate all 
mergers before some particular date, but this would be a major job. 
Except for that possibility, the only thing that can be done is to esti- 
mate the effect of mergers over the entire life of firms. To do this one 
needs to look no further than the measures of proportional growth by 
merger, with initial assets counted as growth by merger; that is, all one 
needs is the evidence Weston has developed on the importance of 
mergers in absolute growth. Since mergers have generally accounted
for substantial fractions of the absolute growth of the firms studied,
it follows that they also account for substantial fractions of the firms’
shares of output. Hence they have caused the pattern of concentration
to be significantly different from what it otherwise would have been.

Weston looks at the problem somewhat differently. He tries to relate 
quantitative changes in concentration with growth by merger, each oc- 
curring over restricted time periods; in addition, he tries to measure the 
fraction of a firm's share of output around 1948 that is attributable to 
mergers occurring over a restricted time period. It is hard to see what 
his findings can be taken to mean. First, the period of mergers will only 
rarely coincide with the period of changes in concentration, since the 
initial date in each case is simply the earliest year for which pertinent 
data could be found, in the one case on assets, in the other on share of 
output. Second, periods will vary widely among firms. Third, the 
estimates of shares of output do not have the accuracy required by the 
analysis. Finally, his sample has dwindled to 25 firms, whose identities 
are not revealed. 

There would seem to be little reason for reviewing his findings, but 
it may be appropriate to say a few more words about his general 
method. The assumptions underlying his approach seem to be those 
outlined above, though he does not state them explicitly, and they are 
not easy to unravel from his explanation of procedures. Perhaps it is 
best to let the reader decide by having Weston speak for himself: 

Of the several possible techniques for measuring the influence of internal 
expansion on existing levels of concentration, the following appeared to be 
most useful. First, the market occupancy percentage of the leading firms in
an industry for the earliest year for which data are available was calculated.
Second, data on external growth were deducted from absolute amounts of
output or total assets of individual firms, but not for the industry as a whole.
Third, the adjusted data of output or total assets for the firms were used to
calculate market occupancy percentages of individual firms which would
have obtained if the growth of the firms had occurred entirely by internal
expansion. Fourth, the adjusted concentration ratios were compared with
the concentration ratios of the earliest period to measure the extent of
present concentration due to internal growth. . . . [7, p. 44].

. . . Concentration ratios which would have existed in the absence of
external growth subsequent to the initial year for which data could be secured
for individual firms . . . are [next] deducted from concentration ratios
existing in 1947, to provide measures of the extent to which present concentration
is accounted for by acquisitions which took place after the early merger
movement [7, p. 46].


This is the entire explanation. There is no further clarification of
how time periods vary; how proportional growth by merger is
measured; how assets acquired by merger were “deducted from absolute
amounts of output”; how data on concentration of assets were
compiled, and in what cases they were used; what firms and industries are
covered in the sample; and so on.

One must regretfully conclude that Weston’s discussion of the role 
played by mergers in the development of recent patterns of industrial 
concentration adds little to the knowledge already implicit in his evi- 
dence on the importance of mergers in absolute growth. Some new 
light is shed on developments in three industries, but these make up 
only a tiny sample of all industries, and much was known about them
before. He does not provide the evidence needed to back up his con- 
tention that mergers have had very little to do with changes in in- 
dustrial concentration since 1904. 


WESTON versus STIGLER 


An interesting sidelight to Weston’s book is a running argument 
with Stigler over the relation between mergers and industrial
concentration.15 It deserves mention because of its strong influence on the
course of Weston’s enquiry.

The primary source of dispute is a paper by Stigler [4]. Weston sees 
in this paper several controversial theses, which he puts as follows: 
(a) “mergers have been the major factor causing the development of
bigness and concentration in the American economy” [7, p. 7]; (b)
mergers around the turn of the century were motivated solely by a
desire to achieve monopoly [7, p. 32]; (c) “mergers have been the main
instruments by which partial monopolies have been transformed into
oligopolies” [7, p. 36]; and (d) survival is the “test of relative efficiency
among firms of different sizes” [7, pp. 64-65].

Now exegesis is a tricky business, and often a fruitless one; there is 
little to be gained here by prolonged laboring over what Stigler “really” 
said. Allow me, however, to offer some brief interpretations of Stigler’s 
views on these points, and to add a few comments on some of Weston's
replies. For the rest, let Stigler’s paper speak for itself. 

Stigler makes a remark in his introductory comments that is akin to 
the first thesis attributed to him. It is not repeated at any other point, 
in either the same or related form. Moreover, close examination reveals 
that the kinship is remote. This is what he says: “There are no large 
American companies that have not grown somewhat by merger, and 


15 There is another argument as well, which will not be discussed here. It deals with the importance
of [illegible]. Stigler’s
general position is that failure to do so makes ratios significantly higher than they should be [3, p. 7;
also apparently in a letter to Weston]. Weston recomputes ratios for 1935 to include imports in total
output and concludes that the ratios generally are not significantly reduced. It [illegible]
this dispute because part of it stems from an unpublished letter sent [illegible] whose
contents are alluded to by Weston.

probably very few that have grown much by the alternative method 
of internal expansion"; to which is added the footnote: “Unless other- 
wise indicated, size of the firm is to be measured relative to the size 
of the industry” [4, p. 23]. I take this to mean that dominant firms 
would not generally have gotten that way in the absence of mergers. 
Weston’s interpretation, which need not be repeated here, is quite
different.

Hard searching has not produced for me the second and third theses.
The closest Stigler comes to them is the following statement, which is
far away indeed: “We shall find it useful to divide this history [of
mergers] into two periods, in which monopoly and oligopoly,
respectively, were the primary goals” [4, p. 27].

The last thesis is really there. His full statement is as follows:

The comparative private costs of firms of various sizes can be measured 
in only one way: by ascertaining whether firms of the various sizes are able 
to survive in the industry. Survival is the only test of a firm’s ability to cope 
with all the problems: buying inputs, soothing laborers, finding customers, 
introducing new products and techniques, coping with fluctuations, evading 
regulations, etc. A cross-sectional study of the costs of inputs per unit of 
output in a given period measures only one facet of the firm’s efficiency and 
yields no conclusion on efficiency in the large. Conversely, if a firm of a given 
size survives, we may infer that its costs are equal to those of other sizes of 
firm, being neither less (or firms of this size would grow in number relative
to the industry) nor more (or firms of this size would decline in number
relative to the industry) [4, p. 26]. 


Weston examines this statement in his chapter on the theory of 
mergers, which we have not discussed. He disagrees on four grounds. 
First, firms classified in the same industry, as defined for instance by 
the Census, do not all produce the same products; in particular, large 
ones typically produce many products, while small ones typically 
specialize in a single product, frequently an accessory or a custom 
item. This is certainly true, and it is a proper warning against in- 
cautious use of data. But it does not contradict Stigler’s proposition. 
It might be added parenthetically that, even by taking full advantage 
of such empirical ambiguities, one is hard-pressed to name more than 
a handful of large industries in which there are not firms of widely 
varying size producing essentially the same product for essentially the 
same markets. Those who are sceptical should try. 

Second, small firms may be kept alive because dominant firms 
spread a price “umbrella” over them; that is, the difference between




firms account for a very small portion of output. Larger firms, if more 
efficient, should gradually displace smaller ones, the point made by 
Stigler. If they do not, the implication is that the cost advantage to 
the dominant firms, if any, must be unique (in the form of an economic 
rent), not available to large firms in general.

Third, smaller firms might be satisfied with lower rates of return on 
investment. This is possible; but so might larger firms. In any event, 
does this make any difference in the general results? The fourth point 
is essentially a repetition of this one, suggesting in addition that smaller 
firms might be satisfied to have their assets undervalued. 

The most interesting thing about this controversy is the way in 
which Weston has misread Stigler, particularly on the first three points. 
In the paper under dispute, Stigler is really trying to find out, in part 
by recourse to history, why mergers have been so often used in prefer- 
ence to internal expansion as a means of achieving dominance in an 
industry; and why they have, since about 1904, contributed more 
toward oligopoly than toward monopoly. In answering these questions 
Stigler is led to a general theoretical explanation for industrial con- 
centration, deflating the importance of economies of scale and inflating 
the importance of temporary exploitation. Here are revealed some of 
the basic issues that motivate us to seek more data on mergers. If the 
data are to be relevant, the empirical questions must also be relevant. 
Somehow, by bare margins in some cases, Weston has failed to raise the 
relevant questions. 


CONCLUDING REMARKS 


This review, like most, has stressed a book’s vices and slighted its 
virtues. A few words are called for to help redress the imbalance.

The book has resulted from a major research undertaking. It offers
much information not before available, and specialists in the field of
industrial organization will surely want to exploit it fully. They will
also want to make use of the likely rich source of data represented by
Weston’s worksheets, obtainable from the Bureau of Business and
Economic Research, University of California.

At the same time, the reader must beware of blind acceptance of
Weston’s conclusions and some of his analysis. The facts as I see them
do not support much of what Weston has to say, particularly about
the influence of mergers on industrial concentration since 1904. In
cases where they do, the conclusions sometimes have a significance
different from what Weston supposes.

In brief, this book breaks the ground well in an area where little
comprehensive statistical work had been previously done. It is not the
final word, but it is a welcome beginning.


REFERENCES 


[1] Edwards, Corwin D., Maintaining Competition, New York, McGraw Hill, 
1949. 

[2] Nutter, G. Warren, The Extent of Enterprise Monopoly in the United States,
1899-1939, Chicago, University of Chicago Press, 1951.

[3] Stigler, George J., “The Extent and Bases of Monopoly,” American Economic
Review, XXXII, Supplement (1942), 1-22.

[4] Stigler, George J., “Monopoly and Oligopoly by Merger,” American Economic Review,
Proceedings, XL (1950), 23-34.

[5] Tennant, Richard B., The American Cigarette Industry, New Haven, Yale 
University Press, 1950. 

[6] U. S. Federal Trade Commission, Report on Motor Vehicle Industry, Washing- 
ton, Government Printing Office, 1939. 

[7] Weston, J. Fred, The Role of Mergers in the Growth of Large Firms, Berkeley 
and Los Angeles, University of California Press, 1953. 


SPURIOUS CORRELATION: A CAUSAL INTERPRETATION*


HERBERT A. SIMON 
Carnegie Institute of Technology 


To test whether a correlation between two variables is 
genuine or spurious, additional variables and equations must 
be introduced, and sufficient assumptions must be made to 
identify the parameters of this wider system. If the two origi- 
nal variables are causally related in the wider system, the 
correlation is “genuine.” 


Even in the first course in statistics, the slogan “Correlation is no
proof of causation!” is imprinted firmly in the mind of the aspiring
statistician or social scientist. It is possible that he leaves the course 
(and many subsequent courses) with no very clear ideas as to what is 
proved by correlation, but he never ceases to be on guard against 
“spurious” correlation, that master of imposture who is always repre- 
senting himself as “true” correlation. 

The very distinction between “true” and “spurious” correlation ap- 
pears to imply that while correlation in general may be no proof of 
causation, “true” correlation does constitute such proof. If this is what 
is intended by the adjective “true,” are there any operational means 
for distinguishing between true correlations, which do imply causation, 
and spurious correlations, which do not?

A generation or more ago, the concept of spurious correlation was
examined by a number of statisticians, and in particular by G. U. 
Yule [8]. More recently important contributions to our understanding 
of the phenomenon have been made by Hans Zeisel [9] and by Patricia 
L. Kendall and Paul F. Lazarsfeld [1]. Essentially, all these treatments 
deal with the three-variable case: the clarification of the relation
between two variables by the introduction of a third. Generalizations
to n variables are indicated but not examined in detail.

Meanwhile, the main stream of statistical research has been diverted
into somewhat different (but closely related) directions by Frisch’s
work on confluence analysis and the subsequent exploration of the
“identification problem” and of “structural relations” at the hands of
Haavelmo, Hurwicz, Koopmans, Marschak, and many others.1 This
work has been carried on at a level of great generality. It has now
reached a point where it can be used to illuminate the concept of
spurious correlation in the three-variable case. The bridge from the
identification problem to the problem of spurious correlation is built
by constructing a precise and operationally meaningful definition of
causality or, more specifically, of causal ordering among variables in
a model.2

* I am indebted to Richard M. Cyert, Paul F. Lazarsfeld, Roy Radner, and T. C. Koopmans for
valuable comments on earlier drafts of this paper.
1 See Koopmans [2] for a survey and references to the literature.


1. STATEMENT OF THE PROBLEM 


How do we ordinarily make causal inferences from data on
correlations? We begin with a set of observations of a pair of variables, x and
y. We compute the coefficient of correlation, $r_{xy}$, between the variables
and whenever this coefficient is significantly different from zero we wish
to know what we can conclude as to the causal relation between the
two variables. If we are suspicious that the observed correlation may
derive from “spurious” causes, we introduce a third variable, z, that,
we conjecture, may account for this observed correlation. We next
compute the partial correlation, $r_{xy \cdot z}$, between x and y with z “held
constant,” and compare this with the zero order correlation, $r_{xy}$. If
$r_{xy \cdot z}$ is close to zero, while $r_{xy}$ is not, we conclude that either: (a) z is an
intervening variable: the causal effect of x on y (or vice versa) operates
through z; or (b) the correlation between x and y results from the joint
causal effect of z on both those variables, and hence this correlation is
spurious. It will be noted that in case (a) we do not know whether the
causal arrow should run from x to y or from y to x (via z in both cases);
and in any event, the correlations do not tell us whether we have
case (a) or case (b).
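
The comparison of the zero order and partial correlations can be sketched in Python. The paper does not write out a formula for the partial correlation; the sketch below assumes the standard first-order formula, and the function name and the three numerical correlations are hypothetical values chosen only for illustration.

```python
import math


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y with z held constant.

    Uses the standard formula
        r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)).
    """
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    return (r_xy - r_xz * r_yz) / denom


# Hypothetical zero order correlations, chosen so that r_xy = r_xz * r_yz.
r_xy, r_xz, r_yz = 0.48, 0.8, 0.6
r_xy_z = partial_correlation(r_xy, r_xz, r_yz)
print(abs(r_xy_z) < 1e-9)  # the partial correlation vanishes (up to rounding)
```

A vanishing partial correlation of this kind is exactly the situation in which cases (a) and (b) cannot be told apart from the correlations alone.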

The problem may be clarified by a pair of specific examples adapted
from Zeisel.3

I. The data consist of measurements of three variables in a number
of groups of people: x is the percentage of members of the group that is
married, y is the average number of pounds of candy consumed per
month per member, z is the average age of members of the group. A
high (negative) correlation, $r_{xy}$, was observed between marital status
and amount of candy consumed. But there was also a high (negative)
correlation, $r_{yz}$, between candy consumption and age; and a high
(positive) correlation, $r_{xz}$, between marital status and age. However,
when age was held constant, the correlation $r_{xy \cdot z}$ between marital
status and candy consumption was nearly zero. By our previous
analysis, either age is an intervening variable between marital status
and candy consumption; or the correlation between marital status
and candy consumption is spurious, being a joint effect caused by the
variation in age. “Common sense” (the nature of which we will want
to examine below in detail) tells us that the latter explanation is the
correct one.

2 Simon [6] and [7]. See also Orcutt [4] and [5]. I sh[illegible]
the caveat that the concept of causal ordering [illegible].
3 Zeisel [9], [illegible]. [illegible] the original source will show that in this and the following
example we [illegible] variables from attributes to continuous variables for purposes of
exposition.

II. The data consist again of measurements of three variables in a
number of groups of people: x is the percentage of female employees
who are married, y is the average number of absences per week per
employee, z is the average number of hours of housework performed
per week per employee. A high (positive) correlation, $r_{xy}$, was
observed between marriage and absenteeism. However, when the amount
of housework, z, was held constant, the correlation $r_{xy \cdot z}$ was virtually
zero. In this case, by applying again some common sense notions about
the direction of causation, we reach the conclusion that z is an
intervening variable between x and y: that is, that marriage results in a
higher average amount of housework performed, and this, in turn, in
more absenteeism.4

Now what is bothersome about these two examples is that the same
statistical evidence, so far as the coefficients of correlation are
concerned, has been used to reach entirely different conclusions in the two
cases. In the first case we concluded that the correlation between x and
y was spurious; in the second case that there was a true causal
relationship, mediated by the intervening variable z. Clearly, it was not the
statistical evidence, but the “common sense” assumptions added
afterwards, that permitted us to draw these distinct conclusions.
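
That the two situations leave the same statistical footprint can be checked with a small simulation. The sketch below is only illustrative: the data are synthetic, the unit coefficients and error variances are arbitrary assumptions, and the helper names `corr` and `partial` are hypothetical.

```python
import math
import random


def corr(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    saa = sum((p - ma) ** 2 for p in a)
    sbb = sum((q - mb) ** 2 for q in b)
    return sab / math.sqrt(saa * sbb)


def partial(x, y, z):
    """Partial correlation of x and y with z held constant."""
    rxy, rxz, ryz = corr(x, y), corr(x, z), corr(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


rng = random.Random(0)
n = 20000

# Common cause (as in example I): z drives both x and y.
z_b = [rng.gauss(0, 1) for _ in range(n)]
x_b = [zi + rng.gauss(0, 1) for zi in z_b]
y_b = [-zi + rng.gauss(0, 1) for zi in z_b]

# Intervening variable (as in example II): x drives z, and z drives y.
x_a = [rng.gauss(0, 1) for _ in range(n)]
z_a = [xi + rng.gauss(0, 1) for xi in x_a]
y_a = [zi + rng.gauss(0, 1) for zi in z_a]

for label, (x, y, z) in {"common cause": (x_b, y_b, z_b),
                         "intervening": (x_a, y_a, z_a)}.items():
    # In both cases r_xy is sizable while r_xy.z is close to zero.
    print(label, round(corr(x, y), 2), round(partial(x, y, z), 2))
```

In both runs the zero order correlation is well away from zero and the partial correlation is near zero, so nothing in the correlations themselves separates the two causal stories.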


2. CAUSAL RELATIONS

In investigating spurious correlation we are interested in learning
whether the relation between two variables persists or disappears when
we introduce a third variable. Throughout this paper (as in all ordinary
correlation analyses) we will assume that the relations in question are
linear, and without loss of generality, that the variables are measured
from their respective means.

4 Zeisel [9], pp. 191-92.



Now suppose we have a system of three variables whose behavior is 
determined by some set of linear mechanisms. In general we will need 
three mechanisms, each represented by an equation—three equations 
to determine the three variables. One such set of mechanisms would 
be that in which each of the variables directly influenced the other 
two. That is, in one equation x would appear as the dependent variable,
y and z as independent variables; in the second equation y would
appear as the dependent variable, x and z as the independent variables;
in the third equation, z as dependent variable, x and y as independent
variables.5

The equations would look like this:

(I)   (2.1)   $x + a_{12} y + a_{13} z = u_1$,
      (2.2)   $a_{21} x + y + a_{23} z = u_2$,
      (2.3)   $a_{31} x + a_{32} y + z = u_3$,

where the u’s are “error” terms that measure the net effects of all other
variables (those not introduced explicitly) upon the system. We refer
to $A = \|a_{ij}\|$ as the coefficient matrix of the system.

Next, let us suppose that not all the variables directly influence all
the others, so that some independent variables are absent from some of
the equations. This is equivalent to saying that some of the elements of
the coefficient matrix are zero. By way of specific example, let us
assume that $a_{21} = a_{31} = a_{32} = 0$. Then the equation system (I) reduces to:

(II)  (2.4)   $x + a_{12} y + a_{13} z = u_1$,
      (2.5)   $y + a_{23} z = u_2$,
      (2.6)   $z = u_3$.

By examining the equations (II), we see that a change in $u_3$ will
change the value of z directly, and the values of x and y indirectly;
a change in $u_2$ will change y directly and x indirectly, but will leave z
unchanged; a change in $u_1$ will change only x. Then we may say that y
is causally dependent on z in (II), and that x is causally dependent on y
and z. If x and y were correlated, we would say that the correlation was
genuine in the case of the system (II), for $a_{12} \neq 0$. Suppose, instead, that
the system were (III):

(III) (2.7)   $x + a_{13} z = u_1$,
      (2.8)   $y + a_{23} z = u_2$,
      (2.9)   $z = u_3$.

In this case we would regard the correlation between x and y as
spurious, because it is due solely to the influence of z on the variables
x and y. Systems (II) and (III) are, of course, not the only possible
cases, and we shall need to consider others later.

5 The question of how we distinguish between “dependent” and “independent” variables is
discussed in Simon [7], and will receive further attention [illegible].
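
The way disturbances travel through system (II) can be traced numerically. In the sketch below the coefficient values are hypothetical (only their being non-zero matters), and the function names are illustrative. The system is solved by back-substitution, and each error term is perturbed in turn to see which variables respond.

```python
def solve_system_ii(u1, u2, u3, a12=0.5, a13=0.3, a23=0.4):
    """Solve system (II) for (x, y, z) by back-substitution.

    The coefficient values are hypothetical illustrations.
    """
    z = u3                      # (2.6)
    y = u2 - a23 * z            # (2.5)
    x = u1 - a12 * y - a13 * z  # (2.4)
    return x, y, z


def affected_by(index):
    """Names of the variables that move when one error term is perturbed."""
    base = solve_system_ii(0.0, 0.0, 0.0)
    u = [0.0, 0.0, 0.0]
    u[index] = 1.0
    shifted = solve_system_ii(*u)
    return [name for name, b, s in zip("xyz", base, shifted) if s != b]


for i in range(3):
    print(f"u{i + 1} affects", affected_by(i))
```

With these values a change in $u_1$ moves only x, a change in $u_2$ moves x and y, and a change in $u_3$ moves all three, which is the ordering described for (II). For special coefficient values the direct and indirect effects on x could cancel exactly, so the check is a sketch rather than a general proof.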


3. THE a priori ASSUMPTIONS


We shall show that the decision that a partial correlation is or is not 
spurious (does not or does indicate a causal ordering) can in general 
only be reached if a priori assumptions are made that certain other 
causal relations do not hold among the variables. This is the meaning 
of the “common sense” assumptions mentioned earlier. Let us make 
this more precise. 

Apart from any statistical evidence, we are prepared to assert in 
the first example of Section 1 that the age of a person does not depend 
upon either his candy consumption or his marital status. Hence z
cannot be causally dependent upon either x or y. This is a genuine
empirical assumption, since the variable “chronological age” really stands,
in these equations, as a surrogate for physiological and sociological
age. Nevertheless, it is an assumption that we are quite prepared to
make on evidence apart from the statistics presented. Similarly, in
the second example of Section 1, we are prepared to assert (on grounds
of other empirical knowledge) that marital status is not causally
dependent upon either amount of housework or absenteeism.6

The need for such a priori assumption follows from considerations of 
elementary algebra. We have seen that whether a correlation is genuine 
or spurious depends on which of the coefficients, $a_{ij}$, of A are zero, and
which are non-zero. But these coefficients are not observable nor are
the “error” terms, $u_1$, $u_2$, and $u_3$. What we observe is a sample of values
of x, y, and z.

Hence, from the standpoint of the problem of statistical estimation,
we must regard the 3n sample values of x, y, and z as numbers given by
observation, and the 3n error terms, $u_i$, together with the six
coefficients, $a_{ij}$, as variables to be estimated. But then we have (3n+6)
variables (3n u’s and six a’s) and only 3n equations (three for each
sample point). Speaking roughly in “equation-counting” terms, we
need six more equations, and we depend on the a priori assumptions
to provide these additional relations.

6 Since these are empirical assumptions it is conceivable that they are wrong, [illegible] we can
imagine mechanisms that would reverse the causal ordering in the second example. [illegible]
here is that these assumptions, right or wrong, are implicit in the determination of whether the
correlation is true or spurious.

The a priori assumptions we commonly employ are of two kinds: 

(1) A priori assumptions that certain variables are not directly
dependent on certain others. Sometimes such assumptions come from
knowledge of the time sequence of events. That is, we make the general
assumption about the world that if y precedes x in time, then $a_{21} = 0$:
x does not directly influence y.

(2) A priori assumptions that the errors are uncorrelated, i.e., that
“all other” variables influencing x are uncorrelated with “all other”
variables influencing y, and so on. Writing $E(u_i u_j)$ for the expected
value of $u_i u_j$, this gives us the three additional equations:

$E(u_1 u_2) = 0; \quad E(u_1 u_3) = 0; \quad E(u_2 u_3) = 0.$


Again it must be emphasized that these assumptions are “a priori” only in the sense that they are not derived from the statistical data from which the correlations among x, y, and z are computed. The assumptions are clearly empirical.

As a matter of fact, it is precisely because we are unwilling to make the analogous empirical assumptions in the two-variable case (the correlation between x and y alone) that the problem of spurious correlation arises at all. For consider the two-variable system:

(IV)
(3.1)  $x + b_{12}y = v_1,$
(3.2)  $y = v_2.$

We suppose that y precedes x in time, so that we are willing to set $b_{21} = 0$ by an assumption of type (1). Then, if we make the type (2) assumption that $E(v_1 v_2) = 0$, we can immediately obtain a unique estimate of $b_{12}$. For multiplying the two equations, and taking expected values, we get:

(3.3)  $E(xy) + b_{12}E(y^2) = E(v_1 v_2) = 0.$

Whence

(3.4)  $b_{12} = -\dfrac{E(xy)}{E(y^2)} = -\dfrac{\sigma_x}{\sigma_y}\, r_{xy}.$


SPURIOUS CORRELATION: A CAUSAL INTERPRETATION 473

It follows immediately that (sampling questions aside) $b_{12}$ will be zero or non-zero as $r_{xy}$ is zero or non-zero. Hence correlation is proof of causation in the two-variable case if we are willing to make the assumptions of time precedence and non-correlation of the error terms.
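The two-variable argument can be tried on artificial data. The sketch below is only an illustration under stated assumptions: the function names, the normal error distribution, the sample size, and the value $b_{12} = -0.8$ are all hypothetical and are not taken from the paper. It draws independent $v_1$ and $v_2$, builds x and y from (3.1)-(3.2), and applies the moment estimate of (3.4).

```python
import random


def estimate_b12(xs, ys):
    """Moment estimate from (3.4): b12 = -E(xy) / E(y^2), moments taken about zero."""
    n = len(xs)
    e_xy = sum(x * y for x, y in zip(xs, ys)) / n
    e_yy = sum(y * y for y in ys) / n
    return -e_xy / e_yy


def simulate(b12, n=20000, seed=0):
    """Draw a sample from x + b12*y = v1, y = v2 with independent v1 and v2."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        v1 = rng.gauss(0.0, 1.0)   # assumed error distribution (illustrative)
        v2 = rng.gauss(0.0, 1.0)
        y = v2                     # equation (3.2)
        x = v1 - b12 * y           # equation (3.1) solved for x
        xs.append(x)
        ys.append(y)
    return xs, ys


# A non-zero coefficient is recovered approximately; a zero one stays near zero.
xs, ys = simulate(b12=-0.8)
b12_hat = estimate_b12(xs, ys)
xs0, ys0 = simulate(b12=0.0)
b12_hat0 = estimate_b12(xs0, ys0)
```

If the errors were instead given a common component, the same estimate would no longer track $b_{12}$, which is the situation the next paragraphs take up.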


If we suspect the correlation to be spurious, we look for a common component, z, of $v_1$ and $v_2$ which might account for their correlation:

(3.5a)  $v_1 = u_1 - a_{13}z,$
(3.5b)  $v_2 = u_2 - a_{23}z.$


Substitution of these relations in (IV) brings us back immediately to systems like (II). This substitution replaces the unobservable $v$'s by unobservable $u$'s. Hence, we are not relieved of the necessity of postulating independence of the errors. We are more willing to make these assumptions in the three-variable case because we have explicitly removed from the error term the component z which we suspect is the source, if any, of the correlation of the $v$'s.

Stated otherwise, introduction of the third variable, z, to test the genuineness or spuriousness of the correlation between x and y, is a method for determining whether in fact the $v$'s of the original two-variable system were uncorrelated. But the test can be carried out only on the assumption that the unobservable error terms of the three-variable system are uncorrelated. If we suspect this to be false, we must further enlarge the system by introduction of a fourth variable, and so on, until we obtain a system we are willing to regard as “complete” in this sense.

Summarizing our analysis we conclude that:

(1) Our task is to determine which of the six off-diagonal matrix coefficients in a system like (I) are zero.

(2) But we are confronted with a system containing a total of nine variables (six coefficients and three unobservable errors), and only three equations.

(3) Hence we must obtain six more relations by making certain a priori assumptions.

(a) Three of these relations may be obtained, from considerations of time precedence of variables or analogous evidence, in the form of direct assumptions that three of the $a_{ij}$ are zero.

(b) Three more relations may be obtained by assuming the errors to be uncorrelated.
4. SPURIOUS CORRELATION


Before proceeding with the algebra, it may be helpful to look more closely at the matrix of coefficients in systems like (I), (II), and (III), disregarding the numerical values of the coefficients, but considering only whether they are non-vanishing (X), or vanishing (0). An example of such a matrix would be

X  0  0
X  X  X
0  0  X

In this case x and z both influence y, but not each other, and y influences neither x nor z. Moreover, a change in $u_2$ ($u_1$ and $u_3$ being constant) will change y, but not x or z; a change in $u_1$ will change x and y, but not z; a change in $u_3$ will change z and y, but not x. Hence the causal ordering may be depicted thus:

x       z
  ↘   ↙
    y

In this case the correlation between x and y is “true,” and not spurious.

Since there are six off-diagonal elements in the matrix, there are $2^6 = 64$ possible configurations of X's and 0's. The a priori assumptions (1), however, require 0's in three specified cells, and hence for each such set of assumptions there are only $2^3 = 8$ possible distinct configurations. If (to make a definite assumption) x does not depend on y, then there are three possible orderings of the variables (z, x, y; x, z, y; x, y, z), and consequently $3 \cdot 8 = 24$ possible configurations, but these 24 configurations are not all distinct. For example, the one depicted above is consistent with either the ordering (z, x, y) or the ordering (x, z, y).
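The counting in this paragraph can be reproduced by brute force. The following is a minimal sketch, assuming that a cell (i, j) may be non-zero only when variable j comes earlier than variable i in the time ordering; the names `allowed_cells` and `configurations` are illustrative and not from the paper.

```python
from itertools import product

VARS = ("x", "y", "z")
# Off-diagonal cells (i, j): "variable i depends directly on variable j".
CELLS = [(i, j) for i in VARS for j in VARS if i != j]


def allowed_cells(ordering):
    """Cells that may be non-zero when the variables occur in this time order."""
    pos = {v: k for k, v in enumerate(ordering)}
    return [(i, j) for (i, j) in CELLS if pos[j] < pos[i]]


def configurations(ordering):
    """All X/0 patterns consistent with one ordering, as sets of non-zero cells."""
    cells = allowed_cells(ordering)
    return {
        frozenset(c for c, on in zip(cells, flags) if on)
        for flags in product((False, True), repeat=len(cells))
    }


total = 2 ** len(CELLS)                                   # all X/0 patterns
orderings = [("z", "x", "y"), ("x", "z", "y"), ("x", "y", "z")]
per_ordering = [configurations(o) for o in orderings]
with_repeats = sum(len(c) for c in per_ordering)          # counted with overlap
distinct = set().union(*per_ordering)                     # overlap removed

# The example matrix: y depends on x and on z, nothing else is non-zero.
example = frozenset({("y", "x"), ("y", "z")})
```

Under these assumptions `total` is 64, each ordering admits 8 patterns, the three orderings give 24 with repetition, and the union holds 16 distinct patterns; the example pattern belongs to the first two orderings but not the third.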

Still assuming that x does not depend on y, we will be interested, in particular, in the following configurations:

X 0 0     X 0 X     X 0 0
X X X     X X 0     X X 0
0 0 X     0 0 X     X 0 X
 (α)       (β)       (γ)

X 0 X     X 0 0
0 X X     0 X X
0 0 X     X 0 X
 (δ)       (ε)


In Case α, either x may precede z, or z, x. In Cases β and δ, z precedes x; in Cases γ and ε, x precedes z. The causal orderings that may be inferred are:

(α)  x → y ← z
(β)  z → x → y
(γ)  y ← x → z
(δ)  x ← z → y
(ε)  x → z → y

The two cases we were confronted with in our earlier examples of Section 1 were δ and ε, respectively. Hence, δ is the case of spurious correlation due to z; ε the case of true correlation with z as an intervening variable.

We come now to the question of which of the matrices that are consistent with the assumed time precedence is the correct one. Suppose, for definiteness, that z precedes x, and x precedes y. Then $a_{12} = a_{31} = a_{32} = 0$; and the system (I) reduces to:

(4.1)  $x + a_{13}z = u_1,$
(4.2)  $a_{21}x + y + a_{23}z = u_2,$
(4.3)  $z = u_3.$

Next, we assume the errors to be uncorrelated:

(4.4)  $E(u_1 u_2) = E(u_1 u_3) = E(u_2 u_3) = 0.$


Multiplying equations (4.1)-(4.3) by pairs, and taking expected values we get:

(4.5)  $a_{21}E(x^2) + E(xy) + a_{23}E(xz) + a_{13}[a_{21}E(xz) + E(yz) + a_{23}E(z^2)] = E(u_1 u_2) = 0,$
(4.6)  $E(xz) + a_{13}E(z^2) = E(u_1 u_3) = 0,$
(4.7)  $a_{21}E(xz) + E(yz) + a_{23}E(z^2) = E(u_2 u_3) = 0.$

Because of (4.7), the terms in the bracket of (4.5) vanish, giving:

(4.8)  $a_{21}E(x^2) + E(xy) + a_{23}E(xz) = 0.$

Solving for E(xz), E(yz) and E(xy) we find:

(4.9)  $E(xz) = -a_{13}E(z^2),$
(4.10)  $E(yz) = (a_{13}a_{21} - a_{23})E(z^2),$
(4.11)  $E(xy) = a_{13}a_{23}E(z^2) - a_{21}E(x^2).$
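Relations (4.9)-(4.11) can be checked numerically by writing each variable as a linear combination of the three errors. The sketch below is an illustration only: the function name `moments`, the chosen coefficients, and the error variances are hypothetical, and exact rational arithmetic is used so that the comparisons are not blurred by rounding.

```python
from fractions import Fraction as F


def moments(a13, a21, a23, var_u1, var_u2, var_u3):
    """Exact second moments for system (4.1)-(4.3) with uncorrelated, zero-mean errors.

    Each variable is held as its coefficients on (u1, u2, u3):
        z = u3,   x = u1 - a13*z,   y = u2 - a21*x - a23*z.
    """
    variances = (var_u1, var_u2, var_u3)
    z = (F(0), F(0), F(1))
    x = (F(1), F(0), -a13)
    u2 = (F(0), F(1), F(0))
    y = tuple(uc - a21 * xc - a23 * zc for uc, xc, zc in zip(u2, x, z))

    def e(p, q):
        # E(pq): cross terms vanish because the u's are uncorrelated, as in (4.4).
        return sum(pc * qc * v for pc, qc, v in zip(p, q, variances))

    return {"xx": e(x, x), "yy": e(y, y), "zz": e(z, z),
            "xy": e(x, y), "xz": e(x, z), "yz": e(y, z)}


# Arbitrary illustrative coefficients and error variances.
a13, a21, a23 = F(1, 2), F(-3, 4), F(2)
m = moments(a13, a21, a23, F(1), F(2), F(3))
```

With these moments in hand, each of (4.9), (4.10), and (4.11) can be compared term by term against the right-hand sides built from `a13`, `a21`, and `a23`.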


Case α: Now in the matrix of case α, above, we have $a_{13} = 0$. Hence:

(4.12a)  $E(xz) = 0;$
(4.12b)  $E(yz) = -a_{23}E(z^2);$
(4.12c)  $E(xy) = -a_{21}E(x^2).$

Case β: In this case, $a_{23} = 0$, hence,

(4.13a)  $E(xz) = -a_{13}E(z^2);$
(4.13b)  $E(yz) = a_{13}a_{21}E(z^2);$
(4.13c)  $E(xy) = -a_{21}E(x^2);$

from which it also follows that:

(4.14)  $E(xy) = E(x^2)\dfrac{E(yz)}{E(xz)}.$

Case δ: In this case, $a_{21} = 0$. Hence,

(4.15a)  $E(xz) = -a_{13}E(z^2);$
(4.15b)  $E(yz) = -a_{23}E(z^2);$
(4.15c)  $E(xy) = a_{13}a_{23}E(z^2);$

and we deduce also that:

(4.16)  $E(xy) = \dfrac{E(xz)E(yz)}{E(z^2)}.$

We have now proved that $a_{13} = 0$ implies (4.12a); that $a_{23} = 0$ implies (4.14); and that $a_{21} = 0$ implies (4.16). We shall show that the converse also holds.

To prove that (4.12a) implies $a_{13} = 0$ we need only set the left-hand side of (4.9) equal to zero.

To prove that (4.14) implies that $a_{23} = 0$ we substitute in (4.14) the values of the cross-products from (4.9)-(4.11). After some simplification, we obtain:

(4.17)  $a_{23}[E(x^2) - a_{13}^2 E(z^2)] = 0.$

Now since, from (4.1),

(4.18)  $E(x^2) - E(u_1^2) + 2a_{13}E(zu_1) = a_{13}^2 E(z^2),$

and since, by multiplying (4.3) by $u_1$, we can show that $E(zu_1) = 0$, the second factor of (4.17) can vanish only in case $E(u_1^2) = 0$. Excluding this degenerate case, we conclude that $a_{23} = 0$.

To prove that (4.16) implies that $a_{21} = 0$, we proceed in a similar manner, obtaining:

(4.19)  $a_{21}[E(x^2) - a_{13}^2 E(z^2)] = 0,$

from which we can conclude that $a_{21} = 0$.
We can summarize the results as follows:

1) If $E(xz) = 0$, $E(yz) \neq 0$, $E(xy) \neq 0$, we have Case α.

2) If none of the cross-products is zero, and

$E(xy) = E(x^2)\dfrac{E(yz)}{E(xz)},$

we have Case β.

3) If none of the cross-products is zero, and

$E(xy) = \dfrac{E(xz)E(yz)}{E(z^2)},$

we have Case δ.

We can combine these conditions to find the conditions that two or more of the coefficients $a_{13}$, $a_{23}$, $a_{21}$ vanish:

4) If $a_{13} = a_{23} = 0$, we find that: $E(xz) = 0$, $E(yz) = 0$. Call this Case (αβ).

5) If $a_{13} = a_{21} = 0$, we find that: $E(xz) = 0$, $E(xy) = 0$. Call this Case (αδ).

6) If $a_{23} = a_{21} = 0$, we find that: $E(yz) = 0$, $E(xy) = 0$. Call this Case (βδ).

7) If $a_{13} = a_{23} = a_{21} = 0$, then $E(xz) = E(yz) = E(xy) = 0$. Call this Case (αβδ).

8) If none of the conditions (1)-(7) are satisfied, then all three coefficients $a_{13}$, $a_{23}$, $a_{21}$ are non-zero. Thus, by observing which of the conditions (1) through (8) are satisfied by the expected values of the cross products, we can determine what the causal ordering is of the variables.
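Conditions (1) through (8) amount to a small decision procedure on the moments. The sketch below is one possible rendering, assuming exact population moments (so that tests for zero are meaningful), the ordering z, x, y, and uncorrelated errors; the function names, the plain-text case labels, and the unit error variances are illustrative choices, not the paper's.

```python
def classify(e_xz, e_yz, e_xy, e_xx, e_zz):
    """Return the case implied by conditions (1)-(8) for exact moments."""
    zero = {"xz": e_xz == 0, "yz": e_yz == 0, "xy": e_xy == 0}
    if all(zero.values()):
        return "alpha-beta-delta"              # condition (7)
    if zero["xz"] and zero["yz"]:
        return "alpha-beta"                    # condition (4)
    if zero["xz"] and zero["xy"]:
        return "alpha-delta"                   # condition (5)
    if zero["yz"] and zero["xy"]:
        return "beta-delta"                    # condition (6)
    if zero["xz"]:
        return "alpha"                         # condition (1)
    if not any(zero.values()):
        if e_xy * e_xz == e_xx * e_yz:
            return "beta"                      # condition (2), cross-multiplied (4.14)
        if e_xy * e_zz == e_xz * e_yz:
            return "delta"                     # condition (3), cross-multiplied (4.16)
    return "all non-zero"                      # condition (8)


def moments_from(a13, a21, a23, s1=1, s3=1):
    """Moments given by (4.9)-(4.11); s1 and s3 are assumed variances of u1 and u3."""
    e_zz = s3
    e_xx = s1 + a13 * a13 * s3
    e_xz = -a13 * e_zz
    e_yz = (a13 * a21 - a23) * e_zz
    e_xy = a13 * a23 * e_zz - a21 * e_xx
    return e_xz, e_yz, e_xy, e_xx, e_zz


# Hypothetical integer coefficients (a13, a21, a23), one set per case.
examples = {
    "alpha": (0, 1, 1),
    "beta": (1, 1, 0),
    "delta": (1, 0, 1),
    "alpha-beta": (0, 1, 0),
    "alpha-delta": (0, 0, 1),
    "beta-delta": (1, 0, 0),
    "alpha-beta-delta": (0, 0, 0),
    "all non-zero": (1, 1, 3),
}
labels = {name: classify(*moments_from(*c)) for name, c in examples.items()}
```

With sampled rather than exact moments, each equality would have to be replaced by a statistical test, a question the paper sets aside.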
We can see also, from this analysis, why the vanishing of the partial correlation of x and y is evidence for the spuriousness of the zero-order correlation between x and y. For the numerator of the partial correlation coefficient $r_{xy.z}$ we have:

(4.20)  $N(r_{xy.z}) = \dfrac{E(xy)}{\sqrt{E(x^2)E(y^2)}} - \dfrac{E(xz)E(yz)}{E(z^2)\sqrt{E(x^2)E(y^2)}}.$

We see that the condition for Case δ is precisely that $r_{xy.z}$ vanish while none of the coefficients, $r_{xy}$, $r_{xz}$, $r_{yz}$, vanish. From this we conclude that the first illustrative example of Section 1 falls in Case δ, as previously asserted. A similar analysis shows that the second illustrative example of Section 1 falls in Case ε.

Of course, the expected values are not, strictly speaking, observable except in a probability sense. However, we do not wish to go into sampling questions here, and simply assume that we have good estimates of the expected values.
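The vanishing of the numerator of the partial correlation in Case δ can be illustrated with a few lines of arithmetic. This is a sketch under assumptions: the coefficient values and unit error variances are hypothetical, and the moments are those of (4.15a)-(4.15c) together with the variances implied by x = u1 - a13 z and y = u2 - a23 z.

```python
import math


def partial_corr_numerator(e_xy, e_xz, e_yz, e_xx, e_yy, e_zz):
    """Numerator of r_xy.z as in (4.20), for zero-mean variables."""
    scale = math.sqrt(e_xx * e_yy)
    return e_xy / scale - (e_xz * e_yz) / (e_zz * scale)


def case_delta_moments(a13, a23, s1=1.0, s2=1.0, s3=1.0):
    """Moments when a21 = 0; s1, s2, s3 are assumed variances of u1, u2, u3."""
    e_zz = s3
    e_xx = s1 + a13 ** 2 * s3
    e_yy = s2 + a23 ** 2 * s3
    e_xz = -a13 * s3            # (4.15a)
    e_yz = -a23 * s3            # (4.15b)
    e_xy = a13 * a23 * s3       # (4.15c)
    return e_xy, e_xz, e_yz, e_xx, e_yy, e_zz


moms = case_delta_moments(a13=0.5, a23=-2.0)
numerator = partial_corr_numerator(*moms)
r_xy = moms[0] / math.sqrt(moms[3] * moms[4])   # zero-order correlation of x and y
```

Here the zero-order correlation is clearly non-zero while the numerator of the partial correlation is zero up to rounding, which is the signature of Case δ.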

In summary, our procedure for interpreting, by introduction of an additional variable z, the correlation between x and y consists in making the six a priori assumptions described earlier; estimating the expected values, E(xy), E(xz), and E(yz); and determining from their values which of the eight enumerated cases holds. Each case corresponds to a specified arrangement of zero and non-zero elements in the coefficient matrix and hence to a definite causal ordering of the variables.


5. THE CASE OF EXPERIMENTATION 


In sections (3)-(4) we have treated $u_1$, $u_2$, and $u_3$ as random variables. The causal ordering among x, y, and z can also be determined without a priori assumptions in the case where $u_1$, $u_2$, and $u_3$ are controlled by an experimenter. For simplicity of illustration we assume there is time precedence among the variables. Then the matrix is triangular, so that $a_{ij} \neq 0$ implies $a_{ji} = 0$; and $a_{ij} \neq 0$, $a_{jk} \neq 0$ implies $a_{ki} = 0$.

Under the given assumptions at least three of the off-diagonal $a$'s in (I) must vanish, and the equations and variables can be reordered so that all the non-vanishing coefficients lie on or below the diagonal. If (with this ordering) $u_2$ or $u_3$ are varied, at least the variable determined by the first equation will remain constant (since it depends only on $u_1$). Similarly, if $u_3$ is varied, the variables determined by the first and second equations will remain constant.

In this way we discover which variables are determined by which equations. Further, if varying $u_i$ causes a particular variable other than the ith to change in value, this variable must be causally dependent on the ith.

Suppose, for example, that variation in $u_1$ brings about a change in x and y, variation in $u_2$ a change in y, and variation in $u_3$ a change in x, y, and z. Then we know that y is causally dependent upon x and z; and x upon z. But this is precisely the Case β treated previously under the assumption that the $u$'s were stochastic variables.
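The experimental procedure can be mimicked on a small linear system. The sketch below is a hypothetical illustration: the coefficient values are invented, the equations are written in time order (z, x, y) so that the matrix is lower triangular, and the error terms are named after the equation they enter (so `u_x`, `u_y`, `u_z` correspond to $u_1$, $u_2$, $u_3$ of the text).

```python
def solve(a, u):
    """Solve the lower-triangular system a @ v = u by forward substitution."""
    v = []
    for i, row in enumerate(a):
        known = sum(row[j] * v[j] for j in range(i))
        v.append((u[i] - known) / row[i])
    return v


def responders(a, names, baseline=(0.0, 0.0, 0.0), step=1.0):
    """For each error term, find the variables that move when it alone is varied."""
    base = solve(a, list(baseline))
    result = {}
    for i in range(len(names)):
        u = list(baseline)
        u[i] += step                      # the experimenter varies one error term
        moved = solve(a, u)
        result["u_" + names[i]] = {
            n for n, b, m in zip(names, base, moved) if b != m
        }
    return result


# Hypothetical Case beta system: z -> x -> y, with no direct z -> y coefficient.
names = ("z", "x", "y")
a = [
    [1.0, 0.0, 0.0],   # z           = u_z
    [0.7, 1.0, 0.0],   # 0.7 z + x   = u_x
    [0.0, 0.4, 1.0],   # 0.4 x + y   = u_y
]
effects = responders(a, names)
```

In this made-up system, varying `u_x` moves x and y, varying `u_y` moves only y, and varying `u_z` moves all three variables, matching the pattern described in the example above.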




6. CONCLUSION 


In this paper I have tried to clarify the logical processes and assumptions that are involved in the usual procedures for testing whether a correlation between two variables is true or spurious. These procedures begin by imbedding the relation between the two variables in a larger three-variable system that is assumed to be self-contained, except for stochastic disturbances or parameters controlled by an experimenter.

Since the coefficients in the three-variable system will not in general 
be identifiable, and since the determination of the causal ordering 
implies identifiability, the test for spuriousness of the correlation re- 
quires additional assumptions to be made. These assumptions are 
usually of two kinds. The first, ordinarily made explicit, are assump- 
tions that certain variables do not have a causal influence on certain 
others. These assumptions reduce the number of degrees of freedom of 
the system of coefficients by implying that three specified coefficients 
are zero. 

The second type of assumption, more often implicit than explicit, 
is that the random disturbances associated with the three-variable 
system are uncorrelated. This assumption gives us a sufficient number 
of additional restrictions to secure the identifiability of the remaining 
coefficients, and hence to determine the causal ordering of the variables.


REFERENCES 


[1] Kendall, Patricia L., and Lazarsfeld, Paul F., “Problems of Survey Analysis,” in Merton and Lazarsfeld (eds.), Continuities in Social Research, The Free Press, 1950, 133-96.

[2] Koopmans, Tjalling C., “Identification Problems in Economic Model Construction,” Econometrica 17: 125-44 (April 1949), reprinted as Chapter II in Studies in Econometric Methods, Cowles Commission Monograph 14.

[3] Koopmans, Tjalling C., “When Is an Equation System Complete for Statistical Purposes?” Chapter 17 in Statistical Inference in Dynamic Economic Models, Cowles Commission Monograph 10.

[4] Orcutt, Guy H., “Toward Partial Redirection of Econometrics,” The Review of Economics and Statistics, 34 (1952), 195-213.

[5] Orcutt, Guy H., “Actions, Consequences, and Causal Relations,” The Review of Economics and Statistics, 34 (1952), 305-14.

[6] Simon, Herbert A., “On the Definition of the Causal Relation,” The Journal of Philosophy, 49 (1952), 517-28.

[7] Simon, Herbert A., “Causal Ordering and Identifiability,” Chapter III in Studies in Econometric Methods, Cowles Commission Monograph 14.

[8] Yule, G. Udny, An Introduction to the Theory of Statistics, Charles Griffin and Co., 10th ed., 1932, Chapters 4, 12. (Equivalent chapters will be found in all subsequent editions of Yule and Yule and Kendall, through the 14th.)

[9] Zeisel, Hans, Say It With Figures, New York, Harper and Brothers, 1947.


EMPIRICAL STUDY OF THE ACCURACY OF SELECTED 
METHODS OF PROJECTING STATE POPULATIONS* 


Helen R. White
United States Bureau of Agricultural Economics 


As tentative guides in the preparation of population projec- 
tions for geographic subdivisions of the United States, the 
accuracy in the past of several methods of projecting popula- 
tion has been measured. These measures have been analyzed 
to some extent for information on the effects of selected factors 
other than methodology on the accuracy of projections. 


Errors in particular population projections have been noted and
analyzed in the literature of this field, and various methods of 
projecting population have been evaluated. Nevertheless, little has 
been written on the accuracy of population projections in general or 
about the effects of such factors as the size of the base population, past 
migration rates, and length of the projection period on the accuracy 
of population projections. The test described below was undertaken in 
order to provide some guides in deciding what methods should be used 
in projecting the populations of geographic subdivisions of the United 
States and in deciding whether any projections should be prepared in 
certain cases.
Design of test.—The study is based on a comparison of projections 
to 1940 and 1950 of the 1930 Census population, for each state and for 
the District of Columbia, prepared by various methods, with the 
Census data for those dates. Since the projections prepared by Thomp- 
son and Whelpton and published in Estimates of Future Population by 
States (National Resources Board, 1934) are based on the 1930 Census, 
these projections could be used to represent the cohort-survival 
method.² The other methods by which projections have been prepared
are limited to those which did not involve extensive computing and 


* This paper is based on a project of the Bureau of the Census carried on while the author was [...]. The assistance of Mrs. Beatrice M. Rosen of the Bureau of the Census is [...]. This study was presented at a meeting of the Population Association of America on April 19, 1952.

¹ See Appendix.

² The cohort-survival method involves making separate allowances for changes in each of its age [...] resulting from mortality and immigration; the initial size of cohorts born after the base date [...] age-specific or cohort birth rates. More detailed descriptions of specific applications of this method are given in: P. K. Whelpton, “An Empirical Method of Calculating [...]



METHODS OF PROJECTING STATE POPULATIONS 481


for which the results would not be biased by the worker's knowledge 
of population trends since 1930. These methods are the geometric, 
arithmetic, apportionment, and ratio methods. 

Both the apportionment method and the ratio method require in- 
dependent projections of the total population of the United States for 
1940 and 1950. For this purpose, the sum of the Thompson-Whelpton 
projections for states, with an allowance for migration, and the actual 
decennial census national totals were used. Although the cohort- 
survival, geometric, and arithmetic methods do not necessarily involve 
the use of independent national totals, the latter two were also adjusted 
to the Thompson-Whelpton projections and all three were adjusted to 
the census national totals. 

The independent projection of the total population of the United 
States has been used as a control total (that is, the state projections 
have been forced to sum to the independent national projection) in the 
instances mentioned above. Such use is not an inherent characteristic 
of the ratio method but arises from the adjustment of the appropriate 
ratios to sum to 1.00. Projections were also obtained by one variation 
of the ratio method without adjustment of the ratios. These projections 
are referred to as “Ratio III (unadjusted)” in the text and are presented in the tables under “Unadjusted to national total.”

As the preparation of projections is not justified unless the results are better than those obtained by using the available current figures on the size of the population, measures of the errors involved in using the 1930 Census data for 1940 and 1950 have also been developed. These figures are presented in the various tables (under the designation of “Constant”) along with the results for the various methods.

Thus the tables present results for the following methods and assumptions:

I. Unadjusted to national total.

1. Cohort-survival, with migration, the cohort-survival method, as- 
suming continuation of internal migration like 1920-30 (see Estimates 
of Future Population by States, mentioned above, for information on the 
basic assumptions). 

2. Cohort-survival, no migration, the same as (1) above except for the assumption of no internal migration.

3. Geometric method, assuming the continuation of the 1920-30 [average annual rate of change].

³ [...] members [of the armed] forces overseas except those [...] because of the small number of armed forces involved.


4. Arithmetic method, assuming continuation of the 1920-30 average 
amount of increase per year. 

5. Constant, the 1930 enumerated population. 

6. Ratio III (1900 to 1930 modified), T & W national projection, the ratio method, using the rules presented in Current Population Reports, Series P-25, No. 56,⁴ for selecting the period used in computing the initial change in the ratio of the population of each division to the total population of the United States and the ratio of the population of each state to the population of its division; these rules, which in this case were applied to 1900-30, 1910-30, and 1920-30, first eliminate any period during which the given ratio did not either constantly increase or constantly decrease, and then select, from the remaining periods, the one for which the absolute value of the average annual rate of change in the given ratio was least. It was also assumed that the annual rate of change of each ratio would decrease linearly to zero within fifty years; i.e., by 1980. As these ratios were not adjusted to sum to 1.00, the projected state populations do not sum to the Thompson-Whelpton national projection to which the ratios were applied.

7. Ratio III (1900 to 1930 modified), census count, the same as (6) above except that the projected ratios were applied to the census national totals.

II. Adjusted to T & W national projection, using as a control total the independent T & W projection (with internal migration) of the total population of the United States.

1. Geometric, the state projections of (I, 3) above adjusted proportionately to sum to the T & W national projection.

2. Arithmetic, the state projections of (I, 4) above adjusted proportionately to sum to the T & W national projection.

3. Apportionment method, assuming (a) that the increase in the total population of the United States, as indicated by the T & W projection, would be distributed in accordance with the distribution of the 1920-30 increase among those states whose population gained between 1920 and 1930 and assuming (b) that the populations of those states in which there was a decrease during that period would remain constant.

4. Ratio I (1870 to 1930), the ratio method, involving the projection of the ratio of the total population of each state to the total population of the United States on the assumptions (a) that the initial change in the ratio would be the same as the 1870-1930 average annual rate of change in the ratio, and (b) that the annual rate of change in the ratio would decrease linearly to zero by 1975; the projected ratios were adjusted to sum to 1.00 and were then applied to the T & W national projection.

⁴ Helen R. White and Jacob S. Siegel, “Projections of the Population by States: 1955 and 1960,” Current Population Reports, Series P-25, No. 56, Bureau of the Census, January 1952.

5. Ratio II (1930), the ratio method, assuming that the ratios would remain at the 1930 level; this assumes that the per cent increase in the population of each state would be the same as that of the T & W national projection to which the assumed ratios were applied.

6. Ratio III (1900 to 1930 modified), using the same assumptions as for (I, 6). The ratios were adjusted to sum to 1.00 and were then applied to the T & W national projection.

III. Adjusted to census count, using the census count as a control 
total. 

1. Cohort-survival, with migration, the state projections of (I, 1) 
adjusted proportionately to sum to the census count of the total popu- 
lation. 

2. Cohort-survival, no migration, the state projections of (I, 2) ad- 
justed proportionately to sum to the census count. 

3. Geometric, the state projections of (I, 3) adjusted proportionately 
to sum to the census count. 

4. Arithmetic, the state projections of (I, 4) adjusted proportionately 
to sum to the census count.

5. Apportionment, assuming (a) that the actual increase in the total population of the United States, from the census count, would be distributed in accordance with the distribution of the 1920-30 increase among those states which gained population during that period and (b) that the population of those states in which there was a decrease during that period would remain constant.

6. Ratio I (1870 to 1930), applying the ratios of (II, 4) to the census count.

7. Ratio II (1930), applying the ratios of (II, 5) to the census count.

8. Ratio III (1900 to 1930 modified), using the same assumptions as for (I, 6). The ratios were adjusted to sum to 1.00 and were then applied to the census count.
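Several of the simpler methods listed above can be written out in a few lines. The following is a rough sketch under assumptions, not the procedure actually used in the study: the state names and population figures are invented, the function names are illustrative, and the geometric method is taken to mean continuation of the 1920-30 average annual rate of change.

```python
def arithmetic(p1920, p1930, years):
    """Continue the 1920-30 average yearly amount of increase."""
    return p1930 + (p1930 - p1920) / 10.0 * years


def geometric(p1920, p1930, years):
    """Continue the 1920-30 average yearly rate of change (assumed meaning)."""
    return p1930 * (p1930 / p1920) ** (years / 10.0)


def adjust_to_control(projections, control_total):
    """Scale state projections proportionately so that they sum to the control total."""
    factor = control_total / sum(projections.values())
    return {s: p * factor for s, p in projections.items()}


def apportionment(p1920, p1930, control_total):
    """Distribute the national increase like the 1920-30 gains; losing states stay constant."""
    gains = {s: max(p1930[s] - p1920[s], 0.0) for s in p1930}
    national_increase = control_total - sum(p1930.values())
    total_gain = sum(gains.values())
    return {s: p1930[s] + national_increase * gains[s] / total_gain for s in p1930}


# Made-up populations (thousands) for three hypothetical states.
p1920 = {"A": 1000.0, "B": 2000.0, "C": 500.0}
p1930 = {"A": 1200.0, "B": 2100.0, "C": 450.0}
control_1940 = 4000.0   # hypothetical independent national projection

arith = {s: arithmetic(p1920[s], p1930[s], 10) for s in p1930}
geom = {s: geometric(p1920[s], p1930[s], 10) for s in p1930}
geom_adj = adjust_to_control(geom, control_1940)
appo = apportionment(p1920, p1930, control_1940)
```

The adjusted geometric projections and the apportionment projections both sum to the control total by construction, while state C, which lost population in 1920-30, is held at its 1930 figure under apportionment.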

As mentioned previously, the projections for 1940 and for 1950, for each state, described above, were compared with the 1940 and 1950 Census returns. Before the comparison was made, the 1950 Census data for each state were adjusted to include members of the armed forces who resided in the given state at the time of induction and to exclude members of the armed forces stationed in the given state who did not reside there at the time of induction.⁵ The deviations of the projections from the census data are summarized in Table 1, which shows the average per cent error (average of the absolute values of the per cent deviations), the maximum per cent error (absolute value), the proportion of errors of ten per cent or more (absolute values), and the proportion of positive errors (over-estimates).
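The four summary measures just described are simple to compute from a set of state projections. The sketch below uses invented figures for four hypothetical states and assumes that each per cent deviation is taken relative to the census figure; the function and key names are illustrative.

```python
def summarize_errors(projected, actual):
    """Summary measures of the kind described above, from per cent deviations."""
    # Per cent deviation of each projection from the census figure (assumed base).
    pct = [100.0 * (projected[s] - actual[s]) / actual[s] for s in actual]
    n = len(pct)
    return {
        "average_pct_error": sum(abs(e) for e in pct) / n,
        "maximum_pct_error": max(abs(e) for e in pct),
        "share_10_pct_or_more": 100.0 * sum(abs(e) >= 10.0 for e in pct) / n,
        "share_positive": 100.0 * sum(e > 0 for e in pct) / n,
    }


# Made-up figures for four hypothetical states.
actual = {"A": 1000.0, "B": 2000.0, "C": 500.0, "D": 4000.0}
projected = {"A": 1100.0, "B": 1900.0, "C": 500.0, "D": 4600.0}
summary = summarize_errors(projected, actual)
```

For these invented figures the deviations are +10, -5, 0, and +15 per cent, so the average per cent error is 7.5, the maximum is 15, and half of the errors are of 10 per cent or more and half are positive.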

It must be kept in mind that the results presented here are only very rough guides for future periods.

Accuracy of various methods.—The various methods are evaluated on 
the basis of the summary measures shown in Table 1 for the unad- 
justed projections and the projections adjusted to the Thompson- 
Whelpton national projections. The projections adjusted to the census 
counts are not considered, since census data could not be used for this 
purpose in actual practice. 

One of the rather interesting results of this study is that the cohort- 
survival method does not appear to yield definitely superior results. 
(It must be remembered, however, that this method has several ad- 
vantages over other methods not wholly dependent on its absolute 
validity.) In fact, no one method is clearly superior to all other methods 
tested and only the Ratio I method is clearly inferior. The Ratio I 
method is probably inferior because the basic assumptions place too 
much emphasis on population change in relatively remote periods; none of the other projections depend so much on population change prior to [...].
For 1940, the apportionment, the cohort-survival (with migration), the Ratio II, and the Ratio III (unadjusted) results are the best on the basis of average per cent error. On the basis of the proportion of errors of 10 per cent or more, the apportionment, the cohort-survival (with migration), and the Ratio III (both unadjusted and adjusted) results are the best.


For 1950, the arithmetic (unadjusted), the Ratio III (unadjusted), the cohort-survival (with migration), the apportionment, and the Ratio II results are the best on the basis of average per cent error, each of these methods having an APE under 13. On the basis of the proportion of errors of 10 per cent or more, the Ratio III (unadjusted), the cohort-survival (with migration), and the arithmetic (unadjusted) projections are the best.

Apparently the cohort-survival (with migration), the apportionment, and the Ratio III (unadjusted) results make the consistently best showings. The differences between most of the various summary measures, however, are too small to justify any definite conclusions except with regard to the unfavorable showing of Ratio I.

TABLE 1. SUMMARY OF PER CENT ERRORS OF PROJECTIONS TO 1940 AND 1950 OF THE POPULATIONS OF THE STATES, FOR SELECTED METHODS

Column groups, each for 1940 and 1950: average per cent error; maximum per cent error; proportion of errors of 10 per cent or more (expressed as a per cent); proportion of positive errors (expressed as a per cent). [Several cells are missing from the source; surviving values are given in source order.]

Unadjusted to national total
  Cohort-survival (T&W)
    With migration | 5.14 | 12.52 | 39.52
    No migration | 6.04 | 15.11 | 45.82
  Geometric | 8.19 | 18.81 | 47.40
  Arithmetic | 6.30 | 10.91 | 38.38
  Constant | 8.39 | 19.27 | 46.89
  Ratio III (1900 to 1930 modified)
    T&W national projection* | 5.80 | 12.46 | 34.67 | 14.3
    Census count | 5.83 | 9.83 | 34.45 | 14.8

Adjusted to T&W national projection*
  Geometric | 7.60 | 17.17 | 37.10 | 30.6 | 75.5
  Arithmetic | 6.06 | 14.03 | 34.23 | 24.5 | 57.1
  Apportionment | 5.02 | 12.71 | 34.75 | 12.2 | 58.1
  Ratio I (1870 to 1930) | 15.98 | 31.75 | 276.38 | 44.9 | 83.7
  Ratio II (1930) | 5.24 | 12.80 | 40.11 | 20.4 | 53.1
  Ratio III (1900 to 1930 modified) | [...] | 36.00 | 16.8 | 55.1

Adjusted to census count
  Cohort-survival (T&W)
    With migration | 5.16 | 10.54 | 22.75 | 5.69 | 12.2 | 38.8 | 38.8 | 49.[illegible]
    No migration | [illegible] | 15.77 | 26.65 | 55.35 | 20.4 | 51.0 | 55.1 | [illegible]
  Geometric | 7.64 | 13.43 | 25.14 | 31.82 | 30.6 | 57.1 | 34.7 | [illegible]
  Arithmetic | 6.10 | 10.96 | 23.81 | 33.98 | 22.4 | 44.9 | 94.7 | [illegible]
  Apportionment | 5.06 | 10.57 | 22.70 | 33.60 | 12.2 | 40.8 | 92.7 | [illegible]
  Ratio I (1870 to 1930) | 15.99 | 30.87 | 117.42 | 310.83 | 44.9 | 67.8 | [illegible] | [illegible]
  Ratio II (1930) | [illegible] | 12.59 | 21.26 | 34.68 | 18.4 | 44.9 | 46.[illegible] | [illegible]
  Ratio III (1900 to 1930 modified) | 6.18 | 10.28 | 23.68 | 32.54 | 16.3 | 40.8 | 94.7 | 36.7

* With migration.

⁵ The effects of military migration during the past decade were removed because they are believed to represent an abnormality which ordinarily could not be taken into account by any of the methods of projecting population. Dr. Henry S. Shryock, Jr., has commented: “At first I thought that your adjustment of the Census data to what we sometimes call the de jure population level was the wrong thing to do here. The members of the armed forces stationed in the several states are the result of what may be regarded as military migration. Since we are usually interested in forecasting the number that the Census will count at a future date, it would seem appropriate to include the armed forces where they [...]. Furthermore, some members of the armed forces were stationed in the several states in 1930. On the other hand, I realize what you are trying to do is to remove direct effects of the [...] preparations on the distribution of population among the states. This attempt is consistent with [...] national projections to assume that a war will not be going on or be in prospect at the future dates for which projections are made. Of course, if the cold war continues long enough, we might [...] resulting size and distribution of the armed forces as normal, and in this case we might hesitate to predict the future distribution of population under peaceful conditions.”

It does appear that projections prepared by any of the methods 
mentioned above as being consistently best will be better guides for 
ten and twenty years in the future than the most recently available 
census data or current estimates. Assuming that the state populations 
would be the same in 1950 as in 1930 yields an APE of 19; also, 74 
per cent of the “projections” are in error by 10 per cent or more. These
values are notably higher than those for the cohort-survival (with 
migration), the apportionment, and the Ratio III (unadjusted) 
methods. 

Control totals —It has generally been assumed that the best current 
estimates⁶ of the populations of the states are obtained by adjusting the various estimates for the states to add to comparable estimates for the United States. If this is true of current estimates, it would not seem unreasonable to expect it to be true of projections. Hagood and Siegel made this assumption in preparing their article, “Projections of the Regional Distribution of the Population of the United States to 1975”;⁷ the method described by them involves adjusting the appropriate ratios to sum exactly to 1.00.

The hypothesis that the use of independent control totals increases
the accuracy of state projections can be tested by comparing the
summary measures for the unadjusted projections with those for the
adjusted projections, both those adjusted to the Thompson-Whelpton
national totals and those adjusted to the census counts, for each
method.⁸ (The constant, apportionment, Ratio I, and Ratio II methods
cannot be included in this comparison, of course.) Although the
results are somewhat inconclusive, the value of the use of control totals
appears to be questionable. For 1950, the APE for each of the methods
for which a comparison can be made, with the exception of the cohort-
survival method (with migration), is somewhat lower for the unadjusted
projections than for the adjusted projections. Even the adjusted
Ratio III projections, which involve the use of divisional control
totals, have a slightly higher APE than the unadjusted Ratio III
projections.

6 ... a figure for a current date which usually does not depend on ...

7 Margaret Jarman Hagood and Jacob S. Siegel, “Projections of the Regional Distribution of the
Population of the United States to 1975,” Agricultural Economics Research, III (1951), pp. 41-52.

8 As the base populations are free of error in comparison with the implicit projections of population
change, it can be argued that the adjustment of the projections should have been in proportion to the
projected change in the population, or some other factor, rather than the projected populations. It
may be desirable to include this alternative in future tests of the sort described here.




It is obvious, of course, that adjusting to the actual census count 
reduces the sum of the differences between the projections and the 
actual population to zero and, further, that if all states had the same 
percentage error, proportionate adjustment to the census count would 
eliminate all errors. However—and this is apparently an important 
however—the value of a proportionate adjustment depends on the 
accuracy of the control total and on the distribution of the errors. If the
errors are randomly distributed with regard to direction (that is, if the 
number of positive errors is approximately 50 per cent of all errors), 
then adjustments can only introduce a bias toward over-estimating 
or under-estimating. If all of the gross errors are positive or all are 
negative, and the errors in the projections for the remaining states 
are negligible, then proportionate adjustment will tend to decrease,
but not eliminate, the gross errors at the cost of increasing the errors
in the projections for the remaining states. Thus, for 1950, adjusting 
introduced (or added to) a bias towards under-estimating in the Ratio 
III, geometric, and arithmetic methods. On the other hand, adjusting 
generally but not consistently yielded better results for 1950 in terms 
of the maximum per cent error and proportion of errors of ten per cent 
or more. 
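
The two situations described in this paragraph can be checked numerically.
The sketch below uses hypothetical figures throughout and adjusts to the
actual count, as in the argument above; `pct_errors` and
`adjust_to_control` are illustrative helper names.

```python
def pct_errors(projected, actual):
    """Signed per cent errors; positive values are over-estimates."""
    return [(p - a) / a * 100.0 for p, a in zip(projected, actual)]


def adjust_to_control(values, control_total):
    """Scale the values proportionately so that they add to control_total."""
    factor = control_total / sum(values)
    return [v * factor for v in values]


actual = [1000.0, 2000.0, 4000.0]  # hypothetical census counts (thousands)
control = sum(actual)              # adjustment to the actual count

# Case 1: every state is 10 per cent too high. Adjustment removes all error.
uniform = [1.10 * a for a in actual]
uniform_adjusted = pct_errors(adjust_to_control(uniform, control), actual)

# Case 2: one gross positive error; the remaining states are exact.
one_gross = [1300.0, 2000.0, 4000.0]
before = pct_errors(one_gross, actual)
after = pct_errors(adjust_to_control(one_gross, control), actual)

print("uniform errors after adjustment:", [round(e, 6) for e in uniform_adjusted])
print("one gross error, before:", [round(e, 2) for e in before])
print("one gross error, after: ", [round(e, 2) for e in after])
```

In the second case the gross error is reduced but not removed, and the two
states that were exact before adjustment become under-estimates.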

The projections adjusted to the actual census count are generally 
better than those adjusted to a national projection, as might be ex- 
pected. 

Length of projection period.—Inspection of Table 1 shows that the
projections for 1940, which involve a 10-year projection period, are
subject to smaller errors than the projections for 1950, which involve
a 20-year projection period. For all of the methods combined, the APE
for 1940 is 7, and 22 per cent of projections are in error by 10 per cent
or more; for 1950, these measures are 15 and 54 per cent, respectively.
This should be a warning against the claim sometimes made that
population projections are satisfactory guides for the long-run trend
even when they deviate in the short-run.

Size of population and migration rate.—It seems worth while to
investigate the relation of the accuracy of the projections to the size
of the base population and to the size of the migration rate
characteristic of the area prior to the projection period. Correlation
coefficients and regression equations would yield the most useful measures
of the relation between these items. However, the measures shown in
Table 2 and in Table 3 are indicative of the relations. Table 2 shows
summary measures of the errors in the 1950 projections for the 24
states with populations of 1.88 million or more in 1930 and the 25
states with populations of 1.85 million or less in 1930; Table 3 shows
similar measures for the 24 states with 1920-30 average migration
rates of 0.5 or less per year (regardless of the direction of the migration)
and the 25 states with 1920-30 average migration rates of 0.6 or more
per year.⁹


TABLE 2.—SUMMARY OF PER CENT ERRORS OF PROJECTIONS
TO 1950 OF THE POPULATIONS OF THE STATES, FOR SELECTED 
METHODS, BY SIZE OF POPULATION IN 1930 


Proportion of 


a Proportion of. 
Average Maximum errors of 10 per Бой COE 
per cent per cent | cent or more (expromed ME 
error error (expressed as р " 
в per cent) *р ош 
Method 
24 25 24 25 
largest |smallest| largest [smallest 
states | states | states | states 
Unadjusted to national total
Cohort-survival (T&W)
With migration 27.54 | 33.8 | 68.0 | 12.5 | 16.0
No migration 45.32 | 45.8 | 04.0 | 37.5 | 32.0
Geometric 47.40 | 50.0 | 56.0 | 70.8 | 30.0
Arithmetic 38.38 | 33.3 | 68.0 | 66.7 | 32.0
Constant 46.02 | 70.8 | 76.0 | 4.2 | 12.0
Ratio III (1900 to 1930
modified)
T&W national projec-
tion* 29.2 | 68.0 | 16.7 | 16.0
Census count 16.7 | 56.0 | 62.5 | 28.0
Adjusted to T&W national
projection*
Geometric 75.0 | 76.0 | 20.8 | 40
Arithmetic 37.5 | 76.0 | 16.7 | 12.0
Apportionment 33.3 | 72.0 | 16.7 | 16.0
Ratio I (1870 to 1930) 79.2 | 88.0 | 12.5 | 40.0
Ratio II (1930) 37.5 | 68.0 | 33.3 | 24.0
Ratio III (1900 to 1930
modified) 37.5 | 72.0 | 8.3 | 12.0
Adjusted to census count
Cohort-survival (T&W)
With migration 35.69 | 33.98 | 20.8 | 56.0 | 66.7 | 32.0
No migration 51.80 | 55. 37.5 | 04.0 | 66.7 | 52.0
Geometric 31.28 | 31 45.8 | 68.0 | 33.3 | 24.0
Arithmetic 33.08 | 28. 16.7 | 72.0 | 54.2 | 24.0
Apportionment 33.61 | 28.30 | 20.8 | 60.0 | 58.3 | 24.0
Ratio I (1870 to 1930) 124.31 | 310.83 | 50.0 | 84.0 | 20.8 | 48.0
Ratio II (1930) 33.56 | 34. .0 | 70.8 | 44.0
Ratio III (1900 to 1930
modified) 32.54 | 30.81 | 16.7 | 64.0 | 50.0 | 24.0
* With migration.


9 ... values of the rates were used. They were obtained from Henry S.
Shryock, Jr., “Internal Migration and the War” (Journal of the American
Statistical Association, 38 (1943), pp. 16-30).





Table 2 suggests definitely that, for a given method and a given
length of projection period, the errors of projections tend to be larger


TABLE 3.—SUMMARY OF PER CENT ERRORS OF PROJECTIONS
TO 1950 OF THE POPULATIONS OF THE STATES, FOR
SELECTED METHODS, BY AVERAGE MIGRATION
RATE FOR 1920-1930


Average 
percent 
error 
Method 
24 
states states 
with 
smallest | largest 
rates 
Unadjusted to national total 
Cohort-survival (T&W) 
With migration 11.07 
No migration 12.79 
Geometric 11.61 
Arithmetic 9.42 
Constant 19.24 
Ratio III (1900 to 1930 
modified) 
T&W national projec- 
tion* 10.99 
Census count 8.43
Adjusted to T&W national
projection*
Geometric 15.15 37.10 | 75.0 
Arithmetic 11.76 33.48 | 50.0 
Apportionment 11.00 31.57 | 50.0 
Ratio I (1870 to 1930) | 24.91 276.88 | 79.2 
Ratio II (1930) 10.85 40.11 | 50.0 
Ratio III (1900 to 1930 . 
modified) 12.56 33.61 | 50.0 
Adjusted to census count 
Cohort-survival (T&W) 
With migration 8.93 | 12.09 20.72 | 29.2
No migration 11.54 | 19.83 55.85 | 83.3 
Geometric 10.76 | 16.00 31.34 | 41.7 
Arithmetic 8.90 | 12.94 27.40 | 25.0 
Apportionment 8.68 | 12.39 26.86 | 20.8 
Ratio I (1870 to 1930) 22.58 | 38.82 310.83 | 58.3 
Ratio II (1930) 9.6 | 15.41 34.08 | 37.5 
Ratio III (1900 to 1930 
modified) 8.63 | 11.88 27.54 | 25.0 


* With migration.


as the size of the population on which the projections are based
becomes smaller. Thus, for all methods shown in Table 2, the APE of the
projections for 1950 for the 24 states with the larger populations is 11,
while the APE for the other 25 states is 19. Also, 39 per cent of the
projections of the larger populations are in error by 10 per cent or
more, while 68 per cent of the projections of the smaller populations
are in error by 10 per cent or more.

The errors also tend to be larger for the states with the larger average 
migration rates, according to Table 3. The APE for the 24 states with 
the smaller migration rates is 12, while the APE for the other states 
is 17. Errors of 10 per cent or more occur in 45 per cent of the projec- 
tions for the first 24 states and in 63 per cent of the projections for the 
second 25 states. 

As might be expected, the projections for the 14 states which are 
both among the 24 states with larger populations and among the 24 
states with the smaller migration rates, have a smaller average per cent 
error (10) and a smaller proportion (33 per cent) of errors of 10 per 
cent or more than either of the two complete groups. 
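
The grouped comparison used here, and the correlation coefficient
mentioned earlier as a more useful measure, can both be sketched briefly.
The helper names and the six areas below are hypothetical; with 49 areas
the same split yields the 24 largest and the 25 smallest.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ape_by_size(base, abs_pct_error):
    """APE for the larger and the smaller areas, ranked by base population.

    The larger group takes len(base) // 2 areas; the rest form the smaller group.
    """
    order = sorted(range(len(base)), key=lambda i: base[i], reverse=True)
    half = len(order) // 2
    larger, smaller = order[:half], order[half:]

    def mean_error(indices):
        return sum(abs_pct_error[i] for i in indices) / len(indices)

    return mean_error(larger), mean_error(smaller)


# Hypothetical base populations (thousands) and absolute per cent errors.
base = [6000, 3500, 2400, 900, 450, 100]
errors = [6.0, 9.0, 12.0, 15.0, 21.0, 27.0]

ape_large, ape_small = ape_by_size(base, errors)
r = pearson(base, errors)
print(ape_large, ape_small, round(r, 2))
```

A negative coefficient corresponds to the tendency noted above for errors
to be larger where the base population is smaller.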

Need for additional research.—Even though the results of this test 
are inconclusive, they are probably of sufficient value and interest to 
warrant additional research along the same general lines. Other areas 
needing additional research are numerous. A few of these areas are men- 
tioned below. Failure to include the logistic method is one of the more 
obvious gaps of this study. The question of using controls at any level 
and the question of a single national control versus controls at the 
national and various intermediate levels, should be explored further. 

In the light of the results for the several ratio methods, the best balance
in the basic assumptions between emphasis on long-run trend and
emphasis on recent experience should be investigated. In connection
with this, some measures relating to the possibility of allowing
specifically for various economic conditions in the future by the use of
“representative” trends from appropriate past periods should be ob- 
tained. Because any period represents unique experience, projections 
for additional combinations of periods should be studied. In addition, 
the significance of the various measures should be tested. 

Conclusions.—Although not one of the methods tested is clearly
superior to the others, the cohort-survival (with migration), the
apportionment, and the Ratio III (unadjusted) results make the
consistently best showings on the basis of average per cent error and
proportion of errors of 10 per cent or more.

The following hypotheses, while not proven by this test, are
consistent with the results obtained:

1. Projections obtained by these three methods will be better guides
for ten and twenty years in the future than the most recent data on
current population size.

2. The value of the use of independent control totals is questionable. 

3. The errors of projections tend to increase almost directly as the 
length of the projection period increases. 

4. The errors of projections tend to be larger for areas with smaller 
base populations. 

5. The errors of projections tend to be larger for areas with larger
net migration rates in the recent past.¹⁰


APPENDIX 


Discussions of the accuracy of population projections will be found 
in the following items: 


Davis, J. S., The Population Upsurge in the United States, War-Peace Pamphlets
    No. 12, Stanford: Food Research Institute, Stanford University, 1949.
Dorn, Harold F., “Pitfalls in Population Forecasts and Projections,” Journal of
    the American Statistical Association, 45 (1950), 311-34.
Hotelling, Harold, “Differential Equations Subject to Error and Population
    Estimates,” Journal of the American Statistical Association, 22 (1927),
    283-314.
Myers, Robert J., “Comparison of Demographic Rates Assumed by the National
    Resources Committee with Actual Experience,” Journal of the American
    Statistical Association, 38 (1943), 201-09.
Schmitt, Robert C., and Albert H. Crosetti, “Accuracy of the Ratio Method for
    Forecasting City Population,” Land Economics, XXVII (1951), 346-48.
Schultz, Henry, “The Standard Error of a Forecast from a Curve,” Journal of the
    American Statistical Association, 25 (1930), 139-85.
Shryock, Henry S., Jr., “Forecasts of Population in the United States,”
    Population Studies, III (1950), 406-12.
Wilson, Edwin B., and Ruth R. Puffer, “Least Squares and Laws of Population
    Growth,” Proceedings of the American Academy of Arts and Sciences, 68
    (1933), 285-382.


Descriptions of various methods of projecting population, references 
to pertinent literature, comments on the development of projections, 
and a priori evaluations of various methods, will be found in the follow- 
ing items: 


Hagood, Margaret Jarman, and Jacob S. Siegel, “Projections of the Regional
    Distribution of the Population of the United States to 1975,” Agricultural
    Economics Research, III (1951), 41-52.
Notestein, Frank W., et al., The Future Population of Europe and the Soviet
    Union, Geneva: League of Nations, 1944, 199-234.
Reed, Lowell J., “Population Growth and Forecasts,” The Annals of the American
    Academy of Political and Social Science, 188 (1936), 159-66.
Taeuber, Irene B., “The Development of Population Predictions in Europe and
    the Americas,” Estadistica, 7 (1944), 323-46.
Taeuber, Irene B., “Current Items—Literature on Future Populations, 1943-
    1948,” Population Index, 15 (1949), 2-30.
Whelpton, P. K., Needed Population Research, Lancaster, Pennsylvania:
    Science Press Printing Co., 1938, 1-11.


10 An article by Robert C. Schmitt and Albert H. Crosetti entitled “Accuracy of the Ratio Method
for Forecasting City Population” (Land Economics, XXVII (1951), pp. 346-48), has just come to the
attention of the author. This article describes a test of the accuracy of the ratio method in predicting
the population of selected large cities and of variations in accuracy with length of projection period,
size of population, and growth rate. The findings of Schmitt and Crosetti are in agreement with (4)
above but not with (5) above. It is possible that the results would agree if coefficients of partial
correlation had been used to measure the association of accuracy of projections, size of base
population, and growth rates or migration rates.


APPENDIX TABLE A.—PROJECTIONS TO 1940 OF THE 1930
POPULATIONS OF THE STATES, BY SELECTED METHODS

(In thousands. Each figure has been independently rounded)


Unadjusted to national total 
1940 
enumer- 
Btate ated 
popu- 
lation 
United States — [131,669 |131,805 [132,098 [143,401 | 139,423 
Alabama 2,833 | 2,801 | 3,024 | 2,973 | 2,987 | 2,646
Arizona 49 | 5| 491| 504 535 436 
Arkansas 1,949 | 1,808 | 2,113 | 1,960] 1,954] 1,854] 1 
California 6,907 | 0,808 | 5,810 | 9,290 | 7,873 | 5,677
Colorado 1,123 | 1,082 | 1,104} 1,139 | 1,130| 1,036) 1 
Connecticut, 1,709 | 1,726 | 1,082 | 1,863 | 1,828| 1,607 
Delaware эвт | 248| 249] 254 258 238 
District of Col, 663 513 488 540 535 487 
Florida 1,807 | 1,716 | 1,557 | 2,208 | 1,956] 1,468] 1,859 
Georgia 3,124 | 2,915 | 3,308 | 2,921| 2,921) 2,909) 2,877) 2,873 
Idaho 525 455 504 458 458 445 453 458 
Illinois 7,897 | 8,177 | 7,933 | 8,943 | 8,748 |. 7,031] 8,201) 8,278 
Indiana 3,428 | 3,405 | 3,398 | 3,570| 3,539| 3,239] 3,328) 3,828 
Iowa 2,538 | 2,497 | 2,649 | 2,538 | 2,536 | 2,471 | 2,441 | 2,498
Kansas 1,801 | 1,940 | 2,035 | 1,997 | 1,0990] 1,881] 1,887] 1,884 
Kentucky 2,846 | 2,732 | 2,955 | 2,823 | 2,808 | 2,615 | 2,005 | 2,661
Louisiana 2,364 | 2,265 | 2,857 | 2,446 | 2,997] 2,102} 2,261) 2,268 
Maine 847 823 852 827 826 797 786 785 
Maryland 1,821 | 1,730 | 1,710} 1,831] 1,809] 1,632] 1,724) 1,721 
Massachusetts | 4,317 | 4,458 | 4,398 | 4,677| 4,637 | 4,250] 4,423) 4,416 
Michigan 5,256 | 5,550| 5,230] 6,349| 5,988] 4,842] 5,653] 5,045 
Minnesota 2,702 | 2,07 | 2,707 | 2,749 | 2,786] 2,504] 2,508) 2,505 
Mississippi 2184 | 2,137 | 2,203 | 2,033] 2,24] 2,010] 2,085) 2,082 
Missouri 3,785| 3,709 | 3,815 | 3,864 | 9,840] 3,020 | 3,045 | 3,009 
Montana. 559 528 588 527 527 538 535 54 
Nebraska 1,316 | 1,415 | 1,501 | 1,463] 1,488] 1,878] 1,384) 1,382 
Nevada 110 92 95| 107 104 91 98| .., 98 
NewHampshire] 492|  476| 480] 488] * 487 465 468 Lae 
New Jersey 4,160 | 4,488 | 4,218] 5,444 4,905 | 4,041 | 4,607 | 4,000 
New Mexico 532|  400| | 4 |  485| 423] 47) 46 
New York 13,479 | 13,709 | 12,989 | 15,187 | 14,787 | 12,588 | 13,798 | 18,778 
North Carolina | 3,572 | 3,575 | 3,082 | 3,007| 3,767 | 3,170] 8,521) 3,515 
North Dakota 642 702 790 716 714 681 680 679 
Ohio 6,908 | 7,107 | 6,956 | 7,04 | -7,512 | 6,647 | 7,108) 7,158 
Oklahoma 21336 | 21820 | 2,785 | 2,819 | 2,755 | 2,806 | 2,011 | 2,007
Oregon 1,090 | 1,028] 978| 1,156 | 1,120 foal LESE trae tot 
Pennsylvania | 9/900 | 10,986 | 10,203 | 10,612 | 10,520 | 9,681 | 10,009 | 10 Dt 
Rhode Island тїз, 732| 717| 78 769 687 725 И 
South Carolina | 1,900 | 1,781| 2,000| 2,794 | 1,792 | 1,739) 1,788) 1,7 
South Dakota 643 728 783 753 748 693 706 oy 
Tennessee 2,916 | 2,754 | 2,937 | 2,920| 2,888 VE 
Texas 6,415 | 6,469 | 0,500 | 7,236 | 6,958 En 
Utah 550 559 584 572 565 $40 
Vermont 359| 365| 379| 307 367 US 
j Virginia 2,678 | 2,495 | 2,682 le 2,537 | 2,582) 2, BSE Се 
Washington 1,736 | 1,058 | 1, 1.795 | 1,1651 1 пекара 
- West Virginia | 1,902 | 1,013 | 1,979 | 2,085 1,988) 1,729 ues Ee 
Wisconsin 3,138 | 3,193 | 3,195 | 3,278] 8,2388| 2,939) 3, p 
Wyoming 251| 247| 251| 280 25P| 226 245 


| * With migration, 


APPENDIX TABLE A.—Continued 
(In thousands. Each figure has been independently rounded)
Adjusted to T&W national projection*
State | Geometric | Arithmetic | Apportionment | Ratio I (1870 to 1930) | Ratio II (1930) | Ratio III (1900 to 1930 modified)
United States 131,865 | 131,865 | 131,865 | 131,865 131,865 131,865 | 
Alabama 2,734 2,778 2,805 2, 66% 2,842 2,751
Arizona 519 506 490 673 468 520 
Arkansas 1,802 1,848 1,909 1,965 1,992 1,869 
California 8,543 7,446 6,875 6,989 6,098 7,699 
Colorado 1,047 1,068 1,087 1,477 1,112 1,068 
Connecticut 1,713 1,729 1,727 1,648 1,726 1,749 
2 Delaware 234 240 247 224 256 239 
District of Col. 497 506 513 514 523 507 
Florida 2,026 1,850 1,734 1,741 1,577 1,839 
Georgia 2,686 2,763 2,915 2,888 3,124 2,845 
Idaho 421 433 452 646 478 AT 
Illinois 8,223 8,974 8,240 7,820 8,196 8,256 
Indiana 3,283 3,347 3,403 3,099 3,478 3,814 
Iowa 2,334 2,899 2,507 2,400 2,654 2,428
Kansas 1,836 1,882 1,940 2,083 2,020 1,877 
Kentucky 2,596 2,656 2,720 2,519 2,808 2,650 
Louisiana 2,250 2,267 2,263 2,136 2,257 2,247 
Maine 761 781 813 712 856 784 
Maryland 1,684 1,711 1,728 1,582 1,752 1,705 
Massachusetts 4,300 4,386 4,461 4,338 4,564 4,411 
Michigan 5,838 5,663 5,407 5,195 5,201 5,629 : 
Minnesota 2,528 «2,588 2,658 2,901 2,754 2,585 
Mississippi 1,870 2,103 2,127 1,991 2,159 2,078 
Missouri 3,553 3,641 3,749 3,521 3,898 3,625 
Montana » 484 498 538 765 577 528 
Nebraska. 1,845 1,379 1,421 1,727 1,480 1,376 
Nevada 98 99 98 92 98 97 
New Hampshire 449 461 477 422 500 462 
New Jersey 4,730 4,639 4,513 4,391 4,341 4,654 
New Mexico 455 458 457 462 455 450 
New York 13,905 13,938 13,761 12,804 13,520 13,074 
North Carolina | 3,503 3,562 3,496 3,244 3,405 3,482 
North Dakota 658 675 699 1,898 731 676 
Ohio 7,029 7,05" |. 7,119 6,620 7,139 7,138 
Oklahoma 2,502 2,605 2,502 3,521 2,578 2,504 
Oregon 1,063 1,059 1,044 1,174 1,024 1,051 
Pennsylvania 9,758 9,950 | 10,116 9,732 10,344 10,008
Rhode Island 717 727 732 712 738 723 
South Carolina | 1,650 1,695 1,768 1,727 1,867 1,738 
South Dakota 692 707 723 1,108 744 702 
Tennessee 2,685 2,732 2,765 2,532 2,810 2,703 
Texas 6,654 6,581 6,443 6,791 6,256 6,545
Utah 526 534 539 580 545 531
Vermont 337 347 363 316 386 339
Virginia 2,333 2,395 2,482 2,334 2,601 2,427
Washington 1,651 1,669 1,673 2,545 1,679 1,647 
West Virginia 1,871 1,880 1,871 1,846 1,857 1,887 
Wisconsin 3,010 3,063 3,102 2,980 3,157 3,080 
Wyoming 20 242 242 316 242 24


* With migration. 




APPENDIX TABLE A.—Continued
(In thousands. Each figure has been independently rounded)
Adjusted to census count
State | Cohort-survival (T&W), with migration | Cohort-survival (T&W), no migration | Geometric | Arithmetic | Apportionment | Ratio I (1870 to 1930) | Ratio II (1930) | Ratio III (1900 to 1930 modified)
United States 131,669 | 131,669 | 131,669 | 131,669 | 131,669 | 131,669 | 131,669 | 131,669
Alabama 2,797 | 3,0 | 2,730 | 2,774 | 2,802 | 2,660) 2,898 2,747 
Arizona 512 489 518 505 488 672 407 519 
Arkansas 1,805 | 2,100| 1,800 | 1,86 | 1,908| 1,962 | 1,989 1,806 
California 6,798 | 5,790 | 8,530 | 7,435 | 6,849 | 6,978 | 6,080 7,687 
Colorado 17080 | 1,100| 1,046 | 1,007 | 1,086 | 1,475 | 1 „111 | 1,007 
Connecticut 1723| 1,676] 1,711 | 1,726 | 1,725) 1,646) 1 ,728 | 1,746 
Delaware 248 248 234 239 246 224 256 239 
District of Col. 512 486 496 505 513 514 522 506 
Florida 1,714 | 1,552| 2,023 | 1,847 | 1,729 | 1,738 1,575 | 1,836 
Georgia 2911 | 3,297 | 2,682 | 2,758 | 2,915 | 2,884 3,119 | 2,841 
Idaho 454 502 421 432 452 645 477 446 
Illinois 8,165 | 7,906 | 8,211 | 8,262 | 8,227 | 7,808 8,183 | 8,243 
Indiana 3,400 | 3,987 | 3,278 | 3,842 | 3,999 | 3,094 3,473 | 3,809 
Iowa 2:493 | 2,040 | 2,930| 2,395 | 2,506 | 2,996 2,650 | 2,425 
Kansas 1,937 | 2,028 | 1,833 | 1,879 | 1,999 | 2,080 2,017 | 1,874
Kentucky 2/728 | 2,945 | 2,592 | 2,652 | 2,718 | 2,515 2,804 | 2,046 
Louisiana 2,202 | 2,349 | 2,246 | 2,204 | 2,250 | 2,133 2,254 | 2,244 
Maine 822 849 760 780 813 7 855 788 
Maryland i,727| 1,704| 1,681 | 1,708| 1,726 | 1,580 1,750 | 1,702 
Massachusetts | 4,452 | 4,383 | 4,294 | 4,970 | 43457 4,332 | 4,557 | 4,404 
Michigan 5,542 | 5,212 5,829 | 5,085 | 5,454 | 5,188 5,193 | 5,621 
Minnesota 2,043 2,758 | 2,524 | 2,584 2,656 | 2,897 | 2,750 2,581 
Mississippi 27134 | 2,285 | 1,867 | 2,100| 2,124) 1 ‚988 | 2,155 | 2,070 
Missouri 3,704 | 3,802| 3,547 | 3,635) 3 утат | 3,516 | 3,802] 3 1620 
Montana 538 764 877 527 
Nebraska 
Nevada 
New Hampshire 
New Jersey 
New Mexico 
New York 


North Carolina 
North Dakota 
Ohio 
Oklahoma 
Oregon 
Pennsylvania 
Rhode Island 
South Carolina 
South Dakota 
Tennessee 
Texas 

Utah 

Vermont 
Virginia 
Washington 
West Virginia 
Wisconsin 
Wyoming 




APPENDIX TABLE A.—Continued 
(In thousands. Each figure has been independently rounded) 


Unadjusted to national total 


Ratio III (1900 to 
1930 modified) 
Geo- Arith- | Con- 
metric | metic | stant |T&W na- 
tional pro-| Census 
jection* | count 
169,729 |153,071 |122,775 | 141,209 |154,136 
3,341 | 8,228 | 2,646 | 2,845 | 3,105 
T31 633 436 608 664 
2,072 | 2,054 | 1,854 1,889 | 2,062 
15,203 | 10,068 | 5,077 | 10,188 | 11,121 | 
1,253 | 1,223 | 1,036 1,115 | 1,217 
2,161 | 2,048 | 1,607 | 1,804 | 2,035 
272 268 238 243 205 
600 583 487 530 578 
3,307 | 2,443 | 1,468 2,220 | 2,423 
2,933 | 2,033 | 2,909] 2,833 | 3,002 
594 465 472 471 445 457 498 
8,738 | 8,509 | 8,015 | 10,480 | 9,866 | 7,631 | 8,783 | 9,587 ` 
9,9000 | 8,522] 3,509] 3,936 | 3,840 | 3,239 | 3,375 | 3,684 
2,644 | 2,502| 2,709 | 2,607 | 2,602 | 2,471| 2,402 | 2,622 
1,908 | 1,973 | 2,159| 2,120| 2,099 | 1,881 | 1,878 | 2,050 
2,944 | 2,851] 3,314 | 3,049 | 3,001 | 2,615 | 2,687 | 2,933 
2,699 2,396 2,599 | 2,848 | 2,693 | 2,102 2,977 | 2,595 І 
924 841 908 858 855 797 772 843 
2,827 | 1,000 | 1,759] 2,055 | 1,086 | 1,632] 1,786 | 1,950 
1,718 | 4,597 | 4,454 | 5147 | 5,025 | 4,250 | 4,531 | 4,945 
6,412) 6,140} 5,511 | 8,324 | 7,133 | 4,842 | 6,335 | 6,914 
3,008 2,693 2,931 | 2,948 | 2,909 | 2,564 2,607 | 2,846 
2,187 | 2,239 | 2,570] 2,057 | 2,438 | 2,010] 2,131 | 2,326 
8,989 | 3,730] 3,030 | 4,113 | 4,069 | 3,029 | 3,631 | 3,964 
595 517 627 516 538 530 578 
1,434 | 1,602 | 1,552 | 1,537 | 1,378 | 1,378 | 1,505 
96 97 125 18 91 104 118 
485 491 512 509 465 459 501 
4,818 | 4,285| 6,548 | 5,769 | 4,041 | 5,244 | 5,724 
510 568 580 546 423 482 526 
14,998 | 12,623 | 18,322 | 16,886 | 12,588 | 14,718 | 16,066 4 
3,952 | 4,246] 4,815 | 4,363 | 3,170] 3,704| 4,141 1 
716 807 752 747 681 674 736 
7,433 7,127 | 8,791 | 8,378 | 6,647 7,541 | 8,231 
2,797 | 3,154 | 3,317 | 3,114 | 2,390 | 2,772 | 3,025 
1,073 980| 1,400| 1,286 954 | 1,183 | 1,292 
10,376 | 10,650 | 11,693 | 11,410 | 9,631 | 10,406 | 11,358 
765 733 884 850 687 751 820 
1,822 | 2,291 | 1,851 | 1,246 | 1,739 | 1,760] 1,921 
757 868 817 803 693 712 777 
2,853 | 3,244 | 3,260 | 3,160 | 2,617 | 2,782 | 3,037 
6,974 | 7,101 | 8,990 | 8,091 | 5,825 | 7,199 | 7,858 
610 663] 645] 622 508 560 | 612 
368 397 374 374 300 323 858 
2,552 | 2,952 | 2,658 | 2,642 | 2,492| 2,462] 2,687 
1,712 | 1,632 | 2,062 | 1,967 | 1,563 | 1,790 | 1,954 
2,088 | 2,238 | 2:394 | 2,247 | 1,720 | 2,045 | 2,232 
3,253 | 3,408 | 3,645 | 3,538| 2,039 | 3,190 | 3,488 
203 271 301 286 226 259| 28 




State 


Adjusted to T&W national projection* 


Geometric | Arithmetic | Apportionment

United States 138,442 | 138,442 | 138,442 | 138,442 
Alabama 2,725 2,863 2,920 2,617 
Arizona 598 562 529 914 
Arkansas 1,690 1,822 1,948 2,021 
California 12,400 8,931 7,742 8,030 
Colorado 1,022 |* 1,085 1,124 1,883 
Connecticut 1,762 1,817 1,815 1,647 
Delaware 221 238 252 222 
District of Col. 489 517 532 526 549 
Florida 2,697 2,167 1,927 1,938 1,656 
Georgia 2,393 2,602 2,920 2,824 3,280 
Idaho 385 418 457 831 502 
Illinois 8,549 8,751 8,682 7,822 8,604 
Indiana 3,211 3,406 3,521 2,949 3,652 
Iowa 2,126 2,308 2,532 2,298 2,786 
Kansas 1,729 1,862 1,984 2,215 2,121 
Kentucky 2,487 2,662 2,796 2,895 2,948 
Louisiana 2,323 2,389 2,380 2,182 2,370 
Maine 700 758 824 609 899 
Maryland 1,076 1,762 1,523 1,840 
Massachusetts 4,198 4,457 4,319 4,792 
Michigan 6,789 6,827 5,385 5,460 
Minnesota 2,404 2,580 3,129 2,801 
Mississippi 1,678 2,162 1,952 2,206 
Missouri 3,355 3,609 * 8,992 4,092 
Montana 457 457 983 606 
Nebraska 1,266 1,364 1,994 |. 1,554 
Nevada 102 104 97 103 
New Hampshire 418 451 388 525 
New Jersey 5,341 5,117 4,596 
New Mexico 473 484 485 
New York 14,944 14,979 12,751 
North Carolina 8,927 3,870 3,240 
North Dakota 614 663 2,954 
Ohio 7,170 7,432 6,479 
Oklahoma 2,705 2,762 4,624 
Oregon 1,142 1,141 1,357 
Pennsylvania 9,537 10,121 9,636 
Rhode Island 721 754 720 
South Carolina 1,510 1,638 1,689 
South Dakota 667 712 1,551 
Tennessee 2,659 2,803 2,437 
Texas 7,832 7,177 7,490 
Utah 526 552 623 
Vermont 305 331 217 
Virginia 2,168 2,343 2,220 
Washington 1,682 1,745 3,627 
West Virginia 1,953 1,993 1,897 
Wisconsin 2,973 3,138 2,949 
Wyoming 246 254 401


* With migration. 




Btate 
Mes Adjusted to 
| qe | m 
S W) LEA e 
ith mi- G 
Yves States gration No mi- | ™ eo- | Arith- 
bama 8 gration е | m - | Api a 
Arizona 15 iege р ar 
x 1,116 |15: tionmen: (187 X Ratio 
rkansas 3,188 1,116 |15; t АД Ratio II pee 
Qu У H 633 mae us D 30) | (1930) пе 
10 "075 | 3. 6 0 
Connect $1310 2,562 650 3,125 151,116 |1 modi- 
Баана 1,213 6,228 1,844 613 3,141 51,116 |151 eo) 
District of 1,980 1,256 13,536 1,989 | 2 604 2,856 Li mi 
Florida Col. roe eid 1,115 9,749 ,024 997 "257 | з ‚116 
Georgia A iram 1,924 1,185 5,412 2,2061 2 a | 3 
Tdaho 2,078 500 242 1,983 1,195 8,705 283 | 2 044 
Illinois 3,180 1,754 534 260 1,982 2,055 6,988 E 
paias 508 4,017 im Д 565 264 1,798 ВЕЕ е 
o 9,288 609 ,612 ,366 569 242 ,978 y 1 
как 3,844 8,680 420 2,840 2,208 | 2 514 за ee 
лы im | бш san эмы mo| som з soo| a 
аша ans Hs 2,321 dn 9,883 "907 3,580 E 
Maryla: 2,615 3,589 1,887 2,519 3,750 8,538 548 ,014 
trm 2 2 3 9,3 483 
Mi THAT 918 2,815 2,714 ‚032 ‚582 3:09 9 on aie 
Mes id 1,965 978 2,536 2,906 2,006 ‚509 3,0% зч 
linnesota 5,018 1,905 764 2,607 2,943 2,418 ^о | 2, 5 
Mississippi 6,702 4,824 1,829 A нн 2,014 2,315 2,601 
Mimi 2,040 5,908 4,582 1,923 846 2,327 5,418 Risa 
onta! 2,444 3,17: 7,41 4,865 1,933 665 ,587 ‚896 
Nebraska 4,072 27 eal ош; 4,909 1,662 Е. 
Rone A adil ot arp deus cest ет TE E 
ew Ha 1,565 679 3,662 2,360 2,857 5,878 "931 | 4, 1 
ады os | on ele 2,374 3,415 5,960 ‚907 
New Мейо 52 iie rcd do: 1600 2,131 3,156 6,840 
Nor ee eee ies STU dies Vd зан 2,88 
Маан ^ 557 4,640 456 114 1,513 1,073 ,407 RS 
orth Р: olina 15,716 615 5,830 492 114 2,176 e| ~ 8 
SE akota 4,314 14,071 516 5,586 502 106 1,000 | 1 И 
pcm 782 4,598 16,313 529 5,511 423 112 e 
gon 8,114 971 4,287 16,350 528 5,017 573 10 
ата : 3,053 7,718 670 4,224 16,244 529 4,074] 5 407 
ode Island ied ыы т 7 4,185 13,918 521 ‚616 
Sout Island in’ m| 1 16 | 2; 5 з | ^ 5 TE E 
th Carolina 1,826 1,061 ‚853 ‚112 737 ‚5836 ‚494 | 15 
South Dak 835 11,534 1,247 3,015 8,119 2,569 3,902 pis 
ccn ра 1,989 794 10,411 1,245 3,006 7072 | 8 838 pu 
m: ER Ee s тт | Cen uu pet a 8,143 
Ve 114 940 ,038 ,144 ,481 , 2:9 
Virginia тше oo [a EC P ed Rd as | me Aas er 
Washington 666 690 298 гў 777 ,830 | 1 786 5н 11.143 
West 402 717 1004 ,060 | з 786 Lee А 6 a 
Virginia 2,786 430 574 7,834 ,079 wl ^ 140 
Wien 1,869 3,197 Fe ues 7,752 2,660 853 1,878 
yoming 2,279 1,767 2,366 362 605 8,175 m 5 е; 
8,551 2,424 1,836 2,558 372 680 "eo | 7,146 
p 287 3,691 2,131 1,904 2,609 | 2 302 eos | 808 
293 3,245 2,176 1,907 433| 2 443 p 
208 3,426 2,170 3,959 ‚981 | 2 
ar 3,448 2/070 1,924 ‚619 
27 3,219 2,128 1,842 
7| 438 3.617 2,176 
278 3,451 
274 


FACTORS IN INTERPRETING MORTALITY 
AFTER RETIREMENT 


ROBERT J. MYERS
Social Security Administration 


Currently there is considerable discussion as to the effect of
compulsory retirement on the national economy and on the
vitality and longevity of the individuals concerned. Some
experiences would seem to indicate that retirement causes
higher mortality than is standard for the ages concerned.
Frequently, however, such conclusions are not warranted because
the individuals who do retire under voluntary provisions tend
to be those who are in poor health. When retirement is
compulsory, such experience as is available does not indicate high
mortality, but this is probably, at least in part, due to the fact
that many quite healthy workers are among the retired group,
in contrast with the situation under plans having voluntary
retirement. There are no conclusive data currently on hand to
indicate, for a given group of individuals, what the effect of
retirement on mortality really is, depending upon whether
similar groups of individuals could retire or could continue
working.


IN RECENT years considerable discussion has been given to the
advantages of continuing individuals in employment beyond age 65
rather than having compulsory retirement at that age, as is the case
in many retirement plans. Such advantages are said to accrue both to
the individual involved and to the nation.

One of the advantages frequently claimed insofar as the individual 
is concerned is that a person compelled to retire loses his vitality and 
thus tends to die much earlier than if allowed to continue in gainful 
employment. This runs contrary to the viewpoint frequently
expressed many years ago that workers were being compelled to remain
at work because there was no pension plan to take care of them so 
that their inevitable end was death from exhaustion. Instead, it was 
advocated that such workers should be allowed to spend their declining 
years in peace and leisure while supported by a pension. 

Currently, there аге about 15,000 pension plans supplementary*¢~ 
the social security program. Many of these, following to some extent 
previous employer practice, provide for a compulsory retirement age 
(often at 65). In the majority of plans, retirement may be deferred with 
the consent of the employer. That retirement at age 65 is by no means 
universal is indicated by the fact that the average retirement age
under the Old-Age and Survivors Insurance program is currently 69 




500 AMERICAN STATISTICAL ASSOCIATION JOURNAL, SEPTEMBER 1954 


for men and somewhat over 68 for women (in 1940-50 it was generally
about one year higher).

This paper will examine the question as to the effect of retirement on
mortality. Before proceeding further, let me issue the warning that no
clear and definite conclusions will be or can be drawn because there are
so many conflicting factors involved.

Unfortunately, specific and reliable data on this subject are not avail- 
able. The analysis is complicated by the question as to whether people 
retire because they are disabled and are thus subject to high mortality, 
or on the other hand whether the high mortality is the result of retire- 
ment. Data as to the mortality of retired persons will be examined for 
several governmental retirement systems and for a few non-govern- 
mental pension plans in an effort to throw some light on this matter. 


EXPERIENCE TO BE EXPECTED IN VARIOUS TYPES OF PLANS 
Before proceeding to such actual data as are available, it will be
worthwhile to examine briefly the effect that the particular provisions
of the plan might have on the resulting experience. This is an
extremely important factor because completely different results may
be obtained for what is essentially the same underlying mortality, all
depending on the structure of the benefits provided and the
administrative procedure adopted.

In considering various possible hypothetical pension plans, let it first 
be assumed that mortality is not affected by retirement. Then we shall 
be able to see that any indications of lower or higher mortality following 
retirement, arise solely from the particular plan and its provisions. 

First, consider a plan which has no benefits payable before age 65— 
either early age retirements or disability retirements—but which has 
compulsory retirement at age 65 and which pays an annuity beginning 
at age 65 to those who previously left service because of disability.
Under this plan, mortality after age 65 would, for the entire retired 
group, be fairly comparable with that previous to age 65, or with what 
might be termed the “general level.” Of course, as between those who 
were in active service when they attained age 65 and those disabled
persons previously separated from service who receive an annuity at
age 65, the former would experience lower mortality.

Second, consider what the situation would be if the previous plan 
did not have compulsory retirement at age 65. For ages shortly after 
age 65, it is likely that the mortality experience would be higher than 
the general level because there would be a tendency for the less healthy 
lives to retire at or shortly after age 65 and for the healthier lives to 


INTERPRETING MORTALITY AFTER RETIREMENT 501


continue at work. After age 70, the mortality experience of the total 
retired would approach the general level of mortality because virtually 
everybody would have retired by then. 

Third, consider the case where disability pensions are provided (or 
where disabled persons receive no vested rights for a pension at age 
65). If retirement is compulsory at age 65, the experience for non- 
disabled retired workers will show definitely lower mortality than the 
general level at the ages shortly after age 65 but eventually will merge 
into the general level. If retirement is not compulsory at age 65, the
resulting mortality experience will probably be somewhat higher than 
the general level at the ages just beyond age 65 and not as high as for 
the group of disabled pensioners. 

Fourth, consider the experience under a plan which permits optional 
retirement before age 65. There is a subdivision between disability 
pensioners and others (as there well might be because of a differential 
in benefit amount favoring the former). The disability pensioners will 
experience quite high mortality, while the other pensioners, at least 
for a few years, will experience very low mortality. This latter group 
would undoubtedly obtain the larger disability pensions if possible 
and therefore must be considered to be quite select medically. 
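
The mechanism common to the hypothetical plans above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the share of impaired lives, the two death rates, and the retirement probabilities are invented for illustration and are not taken from any plan discussed in this paper. The risk of death is set independently of retirement, so any excess mortality among the retired arises solely from who chooses to retire.

```python
import random


def simulate(n=200_000, impaired_share=0.2, q_healthy=0.02, q_impaired=0.08,
             p_retire_healthy=0.3, p_retire_impaired=0.9, seed=1):
    """Return (voluntary retiree death rate, compulsory retiree death rate,
    general death rate) for one year. All parameters are hypothetical."""
    rng = random.Random(seed)
    deaths_all = 0
    retired = 0
    deaths_retired = 0
    for _ in range(n):
        impaired = rng.random() < impaired_share
        # The death risk depends on health only, never on retirement status.
        q = q_impaired if impaired else q_healthy
        dies = rng.random() < q
        deaths_all += dies
        # Under voluntary retirement, impaired lives retire far more often.
        p_retire = p_retire_impaired if impaired else p_retire_healthy
        if rng.random() < p_retire:
            retired += 1
            deaths_retired += dies
    general = deaths_all / n
    voluntary = deaths_retired / retired
    # Under compulsory retirement everyone retires, so the retired group
    # is the whole cohort and shows the general level of mortality.
    compulsory = general
    return voluntary, compulsory, general


voluntary, compulsory, general = simulate()
print(f"voluntary retirees: {voluntary:.4f}")
print(f"compulsory retirees: {compulsory:.4f}")
print(f"general level: {general:.4f}")
```

With these assumed inputs the voluntary retirees show a death rate well above the general level even though retirement has no effect on any individual, which corresponds to the second case described above.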


EXPERIENCE UNDER OLD-AGE AND SURVIVORS INSURANCE PROGRAM 


The old-age and survivors insurance program covers some 80% of 
the paid civilian jobs in the country. In its actual operation, a vast
amount of valuable mortality experience has been accumulated. Un- 
fortunately, it has not been possible to tabulate and analyze all of this 
vast store of information, especially in regard to mortality data strati- 
fied by duration of retirement. 

In the early 1940's a brief investigation was made as to select
mortality by age and duration of retirement. This indicated that, as
contrasted with general population mortality, a person who has just
retired has about 15% higher mortality. But this differential rapidly
diminishes until after two or three years it has virtually disappeared.

More recently it has been possible to make an investigation as to the
over-all mortality experience of retired workers, but only by attained
age and not with regard to duration of retirement. This experience is
summarized in Table 1. For men there is very notable excess mortality
at ages 65 and 66, but this differential gradually decreases for the older
ages. This gives some indication of the higher mortality immediately
after retirement. The effect thereof is diluted at the older ages as most
of the experience is among continued lives rather than newly retired
ones. For women the same general tendency appears to be present
though to a much smaller degree. The mortality of male retired
workers at ages 75 and over very closely parallels population mortality,
but for women the retired workers have 10-15% lower mortality at
these ages and even at and shortly after age 65 the mortality is very
close to that of the general population.

TABLE 1

RATIOS OF ACTUAL TO EXPECTED DEATHS AMONG RETIRED
WORKERS* UNDER OLD-AGE AND SURVIVORS INSURANCE
SYSTEM, 1950-52(b)

Age               Men     Women
65                136%     90%
66                145     107
67                128      99
68                121      98
69                116      94
70                115      90
71                 11      90
72                107      87
73                106      85
74                105      85
75-79              99      82
80-84              98      85
85-89             101      90
90 and Over       103     100
All Ages          109      90

* Includes all persons who claimed benefits even though some returned to work.
(b) Expected deaths based on U.S. 1950 White Lives Mortality Tables. Actual deaths: men 367,00

The old-age and survivors insurance data clearly indicate that
considerably higher mortality than standard arises for individuals
who have just retired, but this differential gradually reduces. This is
particularly the case for men, although there is some indication of it
also being present for women.
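
The ratios in Table 1 compare the deaths actually observed with the number that would be expected if the retired group had experienced standard mortality at each age. A minimal sketch of that calculation follows. It assumes that exposure is measured in life-years and that the standard table supplies an annual death rate per age; the function name and every figure in the usage lines are hypothetical and are not the program's data.

```python
def actual_to_expected(exposure, deaths, standard_rate):
    """Return ({age: ratio in per cent}, overall ratio in per cent).

    exposure      -- {age: life-years exposed to risk}
    deaths        -- {age: deaths actually observed}
    standard_rate -- {age: annual death rate from the standard table}
    """
    # Expected deaths at each age = exposure times the standard rate.
    expected = {age: exposure[age] * standard_rate[age] for age in exposure}
    by_age = {age: 100.0 * deaths[age] / expected[age] for age in exposure}
    # The overall ratio weights each age by its expected deaths.
    overall = (100.0 * sum(deaths[age] for age in exposure)
               / sum(expected.values()))
    return by_age, overall


# Hypothetical figures for two ages.
by_age, overall = actual_to_expected(
    exposure={65: 1000, 66: 1000},
    deaths={65: 36, 66: 40},
    standard_rate={65: 0.03, 66: 0.04},
)
print(by_age, overall)
```

In this invented case the ratio is about 120 per cent at age 65 and about 100 per cent at age 66, with an over-all ratio between the two because the ages are weighted by expected deaths.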


EXPERIENCE UNDER THE RAILROAD RETIREMENT PROGRAM 


The railroad retirement program covering some 1½ million workers
may be said to be a combination of an industry-wide private pension
plan and a social insurance system since it contains elements of both.
In its actual operation, a very considerable amount of valuable
mortality experience has been accumulated. In fact, it is the only large
public retirement system for which good mortality data are available
according to duration of retirement.

Table 2 compares the actuarial rates used in cost valuations for 
mortality of active workers and retired persons for ages 65 and 70. 
These have been tested against actual experience to a certain extent. 
According to these figures it is not expected that the mortality of these 
two groups will differ greatly at those ages where most retirements oc- 
cur. 


TABLE 2

TABULAR MALE MORTALITY RATES USED IN RAILROAD
RETIREMENT SYSTEM*

(per thousand)

                                             Ratio of Active
Age     Active Service    Age Retirements    to Retired
65          30.2               30.2             100%
70          45.4               50.2              91

* Source: “Retirement Policies and the Railroad Retirement System,” Part 1, Senate Report No.
6, 83rd Congress, 1st Session, pp. 341 and 357.


Table 3 compares the ratio of actual to expected deaths among age 
annuitants during a recent 3-year period. The characteristics of this 
plan are such that individuals may retire before age 65, with larger 
benefits if permanent and total disability is proved than if the retire- 
ment is for “age,” and under certain circumstances retirement can be 
for “occupational” disability.

The mortality for age retirements at ages 60-64 is as much as 25%
below the expected level during the early years of retirement although
ultimately the mortality of this group approaches very close to that
of the life table used as the basis of determining expected deaths. On
the other hand, for those retiring at ages 65-69, actual mortality is
appreciably higher than expected mortality, particularly in the first two
years of retirement. This could be anticipated because those at
age 65 who are in better health tend to continue at work and conversely
those in poorer health retire. Those retiring exactly at age 65, relatively,
do not show as much excess mortality in the first few years of
retirement as those retiring at ages 66-69. For those retiring at ages 70 and
over, the mortality experience is quite close to that expected and
shows no significant fluctuation with duration of retirement. This
might well be expected because age 70 is by employer practice virtually
a compulsory age on most railroads. Accordingly this group is a good
cross-section of persons of those ages, although perhaps somewhat
healthier because they have been in employment up to that age.


TABLE 3

RATIO OF ACTUAL TO EXPECTED DEATHS AMONG RAILROAD
RETIREMENT AGE ANNUITANTS, BY DURATION OF
RETIREMENT, 1947-56*

                      Duration of Retirement (Years)
Age at
Retirement(b)       0       1       2       3       4
65                112%    110%     99%     95%     89%
66                135     118     115      99     105
67                123     114     109     116     124
68                141     132     110     109     105
69                111     110      95     107      97
60-64              14      87      75      78      96
65-69             121     114     104     101      99
70 and Over       103      93     103     101     104
All Ages          114     107     102     100     100

* Based on data furnished by Office of Director of Research, Railroad Retirement Board. Data
in summary form are contained in Table A-2, Annual Report of the Railroad Retirement Board for
Fiscal Year Ended June 30, 1952 (but shown there by attained age rather than age at retirement).
Expected deaths based on 1944 Railway Annuitants Mortality Table, set back 1 year in age. Actual
deaths: 25,545.

(b) Age last birthday.


Consideration of the railroad retirement data, in view of the special
provisions of that program, indicates quite clearly that the mortality
of those who retire at and after age 65 is relatively high in the first
few years of retirement. There is no conclusive evidence that this
higher mortality is due to the act of retiring, but rather it seems
probable that the retirements were to some extent caused by ill health
which would have produced higher mortality anyhow.


EXPERIENCE UNDER CIVIL SERVICE RETIREMENT SYSTEM 


The civil service retirement system covers some 13 million
employees of the Federal Government and so is, in effect, a large
self-administered pension plan. In general, depending upon length of
service, age retirement on full annuity can occur at ages 60 or 62. Prior to
then, in certain cases, both disability and age retirement benefits are
available, but the latter are in a reduced amount so that any disabled
person would attempt to have his retirement adjudicated as due to
disability.


Table 4 indicates the difference between the actuarial rates used
in valuation of the system for the mortality of persons in active service
and those who have retired on account of age. These two sets of figures
should not be considered as reflecting the actual experience but rather
give some indication of what is expected from an actuarial standpoint.
For ages 60 to 70, the mortality of persons in active service is indicated
to be some 30-50% lower than for persons who have retired on account
of age.
TABLE 4

TABULAR MALE MORTALITY RATES USED IN CIVIL
SERVICE RETIREMENT SYSTEM*

(per thousand)

                                             Ratio of Active
Age     Active Service    Age Retirements    to Retired
60          14.1               20.8              68%
65          17.6               30.9              57
70          21.5               46.5              46

* Source: Tables 27 and 31, 22nd Annual Report of the Board of Actuaries of the Civil Service
Retirement and Disability Fund for the Fiscal Year Ended June 30, 1942.
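
The last column of Table 4 is simply the active-service rate divided by the age-retirement rate. The short check below recomputes it from the rates printed in the table; the variable and function names are illustrative only.

```python
# Tabular male mortality rates per thousand, as printed in Table 4:
# age -> (active service, age retirements)
TABLE_4 = {60: (14.1, 20.8), 65: (17.6, 30.9), 70: (21.5, 46.5)}


def active_to_retired_percent(active, retired):
    """Ratio of the active-service rate to the retired rate, in whole per cent."""
    return round(100 * active / retired)


ratios = {age: active_to_retired_percent(active, retired)
          for age, (active, retired) in TABLE_4.items()}
print(ratios)
```

The recomputed ratios agree with the printed column, so the tabular active-service rates lie roughly one-third to one-half below the tabular rates for age retirements.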


Unfortunately, select data according to duration of retirement are
not available for this system. Table 5, however, does show the ratio of
actual to expected deaths by attained age for age retirements during
a recent 3-year period. For men, the mortality experience under age 60,
which is in respect to individuals who voluntarily retired on a reduced
annuity and thus apparently could not prove disability, was relatively
low, just as was the case in the railroad retirement data. For attained
ages 60-66, mortality is definitely higher than that according to the
standard table, while at the older ages the two tend to come together.
Since this is an aggregate experience for all ages of retirement
combined, it would be expected that this would occur at least after age 70,
which is the compulsory retirement age. For women, the same general
trends are evident except that there are greater fluctuations in the
mortality ratios due to the smaller number of persons involved and
except that the mortality ratios for ages under 80 tend to show actual
mortality well below that expected. This is not a significant factor in
the experience as to the effect of retirement, but rather indicates that
the standard table in use for women has too high mortality rates.

TABLE 5

RATIOS OF ACTUAL TO EXPECTED DEATHS AMONG CIVIL
SERVICE RETIREMENT NON-DISABILITY ANNUITANTS,
FISCAL YEARS 1950-52*

Age               Men     Women
Under 60           99%     40%
60                121      94
61                109      65
62                119      84
63                113      79
64                120      61
65                119      79
66                113      69
67                103      79
68                110      83
69                102      66
70-74              94      74
75-79              95      84
80-84              97     100
85-89              92      98
90 and Over        93     106
All Ages           99      81

* Based on data furnished by Retirement Section, U. S. Civil Service Commission. Expected
deaths based on tabular rates shown in Table 4. Actual deaths: men, 16,307; women, 1,561.

The experience under the civil service retirement program seems to
confirm, in general, that of the railroad retirement system. Mortality
is definitely lower than standard for those taking age retirement
prior to the normal age and is definitely higher for those retiring at the
normal age and a few years later. Again, this seems to indicate that
the higher mortality shortly after retirement at or after the normal
age is in considerable part due to the fact that ill health tended to
cause retirement rather than vice versa.


EXPERIENCE UNDER PRIVATE PLANS 


For many years insurance companies have collected experience
under the group annuity plans which they sell primarily to commercial
and industrial concerns. In general, the annuities are payable
beginning at age 65 regardless of whether the individual retires at that
age, although in actual fact he may not receive the payment. Two
subdivisions possible in the group annuity data are for “normal”
retirements (generally payable from age 65 on) and “early” retirements,
which in many, if not most, cases are disability retirements. As would
be anticipated, the mortality under the “early” retirements is very high,
especially at ages prior to 65, but subsequently tends to come closer
to the mortality for the “normal” retirements (see Table 6). On the
other hand, for the “normal” retirements, mortality shortly after age
65 tends to be somewhat low since payments generally begin
automatically at age 65 and are thus made to quite healthy lives since a
considerable number of disabled lives have already been eliminated as a
result of the “early” retirements.

TABLE 6

RATIO OF ACTUAL TO EXPECTED DEATHS AMONG MALE
SERVICE PENSIONERS IN THREE SELF-ADMINISTERED
PRIVATE PENSION PLANS*

                 Group Annuity (1946-50)    Plan A(b)    Plan B(c)    Plan C(d)
Age              “Normal”     “Early”       (1943-52)    (1946-51)    (1946-51)
Under 55                       312%           465%
55-59             152%         248            280
60-64              96          197            166
65-69             102          149            116           86%          94%
70-74             112          136            118          107          102
75-79             119          137            123          138          117
80-84             137          113            131          147          120
85-89             124          (e)            120          117          129
90 and Over       127          (e)            128          (e)          110
All Ages          109          174            130          111          105

* Source: Report of Special Committee on Experience under Self-Administered Retirement Plans,
Transactions of the Society of Actuaries, 1953 Reports. Expected deaths based on 1937 Standard Annuity
Mortality Table. Actual deaths: Plan A, 5,316; Plan B, 613; Plan C, 1,072.
(b) Group of public utilities covered under uniform plan.
(c) Electric utility company.
(d) Large company in electrical manufacturing industry.
(e) Insufficient data.


There have recently become available the first results of a continuing
study by a committee of the Society of Actuaries in regard to the
mortality experience under self-administered retirement plans. As
discussed previously, the resulting experience must be considered very
carefully in view of the fact that the particular provisions in each plan
will materially affect the results.

Table 6 compares the actual and expected deaths among male serv- 
ice pensioners under three privately administered pension plans. It 
should be noted that the mortality table used as a basis of the expected
deaths is significantly too low at the extreme ages (beyond age 80)
so that the mortality ratios developing tend to be artificially high.

Plan A has compulsory retirement at age 65 but has disability pen- 
sions prior to that age, which are included in this experience. Accord- 
ingly, as would be expected, there are very high mortality ratios prior 
to age 65, while those after age 65 tend to be somewhat above the group 
annuity “normal” retirement experience, at least between ages 65 and 
75. 

Both Plans B and C have compulsory retirement at age 65 and have 
separate disability benefits before age 65, the experience of which is not 
included here. As a result for ages 65 to 75, these plans show very low 
mortality since those entering the experience upon compulsory retire- 
ment at age 65 tend on the whole to be quite healthy lives. Certainly 
this latter experience would, of itself, not seem to give any indication 
that compulsory retirement produces high mortality. 

SUMMARY 

The preceding analyses of the mortality experience under various 
governmental and private pension programs indicate quite clearly that, 
in the absence of any special circumstances, the mortality of retired 
workers during the first year or two of retirement is considerably above 
the general level which otherwise might be expected but thereafter 
merges with such general level. It seems likely that this higher mortality 
in the early years of retirement arises from the fact that those in poorer
health are more apt to retire at or shortly after the minimum retirement 
age, while the healthier individuals continue at work. 

An important factor to consider is that those retiring under a plan
which does not have compulsory retirement generally tend to be the
less healthy lives. On the other hand, in a plan providing for
compulsory retirement at a particular age, those still in service at that age
generally tend to be somewhat healthier than the normal population
since they have recently been at work. Thus, it would be completely
erroneous to contrast the mortality under a plan with compulsory
retirement and that under a plan with voluntary retirement if only
pensioners were considered. The results would seem to indicate
lower mortality for the compulsory plan, which would not be a valid
conclusion. It would really be necessary to contrast the mortality of
pensioners under the compulsory retirement plan with that of both
active employees and pensioners under the voluntary retirement plan.
No such data were available to the author since usually mortality of
active employees is not as closely studied as that for retired persons,
particularly in governmentally administered plans. But if any progress
is to be made in exploration of the subject of mortality after retirement,
it will be necessary to obtain such data.¹

The preceding discussion does not, however, mean that compulsory 
retirement might not have a serious effect on an individual's health 
and vitality, especially if he had not adjusted himself to the separation 
from employment. Unfortunately, currently available data do not
measure the effect of retirement on mortality after retirement. A priori
reasoning would seem to indicate that compulsory retirement would 
certainly have some deleterious effect on mortality for some persons, 

1 The Department of Sociology and Anthropology of Cornell University is currently conducting a
longitudinal study on the effect of retirement on mortality and morbidity, as between retirants and
non-retirants under plans having different provisions as to retirement policy. For a description of this
study see Milton L. Barron, Gordon Streib, and Edward A. Suchman, “Research on the Social
Disorganization of Retirement,” American Sociological Review, 17 (1952).




SAMPLING CONTROL OF LITERACY DATA* 


S. S. Zarkovic
Federal Bureau of Statistics, Belgrade, Yugoslavia 


An attempt is described to control the value of literacy data
by the use of sampling methods. The reason for this research
is the widely known unreliability of this sort of statistics in
countries with a high rate of illiteracy. This research has been
conducted as a part of the post-enumeration survey, taken in
connection with the Yugoslav census of population as of
March 31, 1953. The aim of the survey was the control of
accuracy and value of different census results. The value of
literacy data was checked on the sample of individuals by
means of reading and writing tests. The results show (i) that
literacy is a continuous variable, and (ii) that the unreliable
character of literacy statistics is connected with the difficulty of
defining the limit between the different levels of literacy. Since
these limits cannot be defined in the census of population, the
best method to check the value of literacy data seems to be the
use of sampling methods.


THE PROBLEM 


DATA on literacy are usually obtained in the census of population.
Each person over a given age is asked about his ability to read and
write. What is the value of data provided in this way?


* The author wishes to express his indebtedness to Mr. S. Krasovec, formerly director of the
Federal Statistical Office, and to Mr. M. Macura, director of the Serbian Statistical Office, who spent a lot of
energy to make this research possible. The research described in this paper belongs to the new field in
statistics that could be labeled “The problem of the value of statistical data.” The most important
work in connection with this problem has been done in the USA and India. Results and experience
obtained so far can be found in the following papers: M. H. Hansen, W. N. Hurwitz, E. S. Marks, W. P.
Mauldin: Response errors in surveys, Journal of the American Statistical Association, 46 (1951), 147-90;
P. C. Mahalanobis: Recent experiments in statistical sampling in the Indian Statistical Institute,
Journal of the Royal Statistical Society, 109; P. V. Sukhatme, G. R. Seth: Measurement of non-sampling
errors, Journal Indian Society Agriculture, Vol. 4; P. V. Sukhatme: Measurement of observational
errors in surveys, Revue de l'Institut International de Statistique, Vol. 20; M. H. Hansen, W. N. Hurwitz,
L. Pritzker: The accuracy of census results, American Sociological Review, 1953; W. E. Deming: On
errors in surveys, American Sociological Review, Vol. 9; E. S. Marks, W. P. Mauldin: Problems of
response in enumerative surveys, American Sociological Review, Vol. 18; E. S. Marks, W. P. Mauldin,
A. Nisselson: A case history in survey design: The post-enumeration survey of the 1950 census, Journal
of the American Statistical Association, 48 (1953), 220-43; G. L. Palmer: Factors in the variability of
response in enumerative studies, Journal of the American Statistical Association, 38 (1943), 143-52;
S. S. Zarkovic: Completeness of enumeration (in Serbian), Federal Statistical Office, Belgrade, 1954;
Accuracy of family budget data with reference to period of recall, Calcutta Stat. Assoc. Bull.;
M. H. Hansen, W. N. Hurwitz, W. G. Madow: Sample Survey Methods and Theory, New York,
Wiley, 1953; W. E. Deming: Some Theory of Sampling, New York, Wiley, 1950; S. S. Zarkovic:
Population Census Errors (in Serbian), Federal Statistical Office, Belgrade, 1954.




The doubtful reliability of this sort of statistics is well known for it 
is clear that the answers are to a large extent the result of personal 
opinion of what literacy is. In Population Census Methods one reads: 
“The meaning of the data on literacy and illiteracy obtained in a 
population census depends obviously to an important degree upon the 
extent of reading and writing ability that is assumed by the enumer- 
ators and respondents to be required for an affirmative answer.” 

To illustrate the unreliable character of these data we shall mention
two examples.

In a European country with a very low percentage of illiterates, it 
was noticed after the mobilization during the last war that the per- 
centage of those unable to read and write was far higher than was 
found during the preceding census. The situation among women was 
still worse. It is obvious that the literacy situation, as depicted by the
census statistics, is rather vague. 

The next example is from Yugoslavia. The successive censuses gave 
the following percentages of illiterates: 


1921    50.5
1931    44.6
1948    25.4
1953    24.9²


From this it appears that in the first 10 years illiteracy decreased by
6 per cent and in the next 17 years by 19 per cent in spite of the fact
that the schools were practically closed during the war. But in the last five
years, during the regular work of all schools, with a very expanded
system of education, including a great number of courses for teaching the
alphabets and compulsory learning of reading and writing for those in
the military service, illiteracy remained on the same level. There is
something in this situation that deviates from a logical pattern.
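
The pattern can be made explicit by computing the decline between successive censuses and the average decline per year. The sketch below uses only the four percentages quoted above; the function name is illustrative.

```python
# Percentages of illiterates from the successive censuses quoted in the text.
CENSUS = [(1921, 50.5), (1931, 44.6), (1948, 25.4), (1953, 24.9)]


def declines(series):
    """Return (start, end, drop in percentage points, drop per year) per interval."""
    rows = []
    for (year0, pct0), (year1, pct1) in zip(series, series[1:]):
        drop = pct0 - pct1
        rows.append((year0, year1, round(drop, 1),
                     round(drop / (year1 - year0), 2)))
    return rows


for row in declines(CENSUS):
    print(row)
```

The drops of 6 and 19 per cent mentioned in the text are differences in percentage points (5.9 and 19.2 before rounding). On a per-year basis the decline in 1948-53 is far smaller than in either earlier interval, which is the irregularity the text points to.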

On the basis of this figure for 1948, estimated with good reasons as 
being optimistic, while preparing for the census of 1953, we put in our 
program an investigation of the value of answers given by respondents 
on all census questions and consequently on the question of literacy 
as well.

This research proceeded in two directions. First, a sample of
individuals was drawn immediately after the census, each of whom was
requested by a specially trained inspector to answer again all census
questions. Here the interest was in the stability of answers obtained in
the census and in the extent of errors in them. The second research,
also based on the sample of individuals, had as its aim to check, by
means of tests, the degree of literacy of those who declared in the
census that they were literate.

1 United Nations, 1949, p. 83.
2 This figure is a sample estimate.


DATA ON THE SAMPLE* 


To facilitate the organization of the census the whole country was
divided into 118,999 enumeration districts (e.d.) with an average of 142
people, ranging in size from 0 to 300. For the purposes of sampling
these e.d. were stratified in two strata, urban and rural. The urban
stratum consisted of 29,805 e.d. and the rural one of 89,194. Before
the beginning of the census only the total number of e.d. was known
in each administrative unit and their distribution in strata. On this
basis a random sample of 149 e.d. was drawn in the rural stratum and
100 in the urban one. Each e.d. was assigned equal probability.

In order to get individuals, subsampling was used. In the research
on the stability of answers in the urban stratum the sampling fraction 
of 1:10 was applied to the total number of the enumerated people in 
an e.d. In the rural stratum the sampling fraction was 1:8. In this way 
a total of 1,682 people were investigated in the urban stratum and 
2,470 in the rural one. 

This gives the situation in Table 1. 


TABLE 1
SOME DATA ON THE SAMPLE

             Primary units (e.d.)       Number of                       Secondary units
             in the sample              people in the                   (people) in the
                                        sample of        Percentage     sample: number
Stratum      Number     Percentage      primary units    of total       of people
Rural          149        0.167            19,818          0.163            2,470
Urban          100        0.336            16,785          0.347            1,682
Total          249        0.209            36,603          0.216            4,152


* Detailed information on this sample is given in S. S. Zarkovich: Estimating Census Figures (in
Serbian), series “Studies and Analyses,” No. 1, Federal Statistical Office, Belgrade, 1953.


The selection of secondary units was arrived at on the basis of census 
questionnaires, concentrated at that time in the commune office. 
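
The two-stage design described above can be checked with a short calculation. The sketch below takes the stratum totals and sample counts from the text and from Table 1; the dictionary layout and function name are illustrative. It computes the percentage of enumeration districts drawn in each stratum and the number of people one would expect from applying the 1:8 and 1:10 subsampling fractions to the people enumerated in the sampled districts.

```python
# Figures quoted in the text and in Table 1.
STRATA = {
    "rural": {"ed_total": 89_194, "ed_sample": 149,
              "people_in_sample_ed": 19_818, "subsample_every": 8},
    "urban": {"ed_total": 29_805, "ed_sample": 100,
              "people_in_sample_ed": 16_785, "subsample_every": 10},
}


def design_summary(stratum):
    """Return (percentage of e.d. sampled, expected number of people subsampled)."""
    s = STRATA[stratum]
    primary_pct = 100 * s["ed_sample"] / s["ed_total"]
    expected_people = s["people_in_sample_ed"] / s["subsample_every"]
    return round(primary_pct, 3), expected_people


for name in STRATA:
    print(name, design_summary(name))
```

The computed percentages reproduce the 0.167 and 0.336 shown in Table 1, and the expected subsample sizes fall within a few persons of the 2,470 rural and 1,682 urban individuals reported in the text; small differences are to be expected when a fixed fraction is applied district by district.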

The task of inspectors was to find the people drawn into the sample 
and, using special control questions and available documents, to at- 
tempt to get the right answer on all census questions. In this work 
inspectors didn’t know the answers given by respondents during the 
census. 


These are data on the sample designed for the purpose of the general
control of all the answers on census questions.

For the second research only those persons have been taken into 
account who: i) were, at the beginning of the census, 10 years of age 
and over, ii) put the answer “reads” and “writes” in the census ques- 
tionnaire, iii) had an education of 4 years of elementary school or less. 
Those having more education were considered as definitely literate. 

For this program the same sample of primary units was used as in
Table 1. Since the definition of the population now was changed, the
census returns were used again to select a new sample of secondary
units. In this selection the sampling fraction of 1:6 in the rural stratum
and 1:8 in the urban one was applied. So the sample consisted of 417 
people in the urban stratum and 1,022 in the rural one. 

In addition, another small sample was drawn of those who declared 
themselves illiterate. The purpose of this sample was to check whether 
this group of the population was homogeneously illiterate. 

The individuals selected in this way came into a school where a test 
of their ability to read and to write was administered. Reading was 
investigated by means of 15 tests,³ each having some printed phrases
and three control questions that had to be answered by marking the
right answer. Each right answer represented one point. The maximum
number of points was 45. The testing of each group was limited to 10
minutes.

The ability to write was tested in a similar way. Our inspector
dictated three phrases that had to be written in a limited time.


ANALYSIS OF ERRORS 


When the field work was completed the control forms were matched
against the census forms, and a case was defined as an error when the
answers were not identical. In connection with literacy, 3.2 per cent
wrong answers were found in the rural stratum and 2.8 per cent in the
urban one. These percentages are calculated on the basis of the total

³ These tests were prepared in the Department of Psychology,
University of Belgrade, by B. Stevanovich, N. Rot, Z. Vasich and M. Jovichich.


514 AMERICAN STATISTICAL ASSOCIATION JOURNAL, SEPTEMBER 1954 


number of enumerated people in the e.d. Not all of these errors, however,
represent a changed datum on literacy. Only 59.0 per cent
(36 people) of the total number of errors in the urban stratum and 65.4
per cent (53 people) in the rural stratum represent changes in the answer
previously given. The remaining errors consist of omitted answers for
people 10 years of age and over, or of answers given for children
under 10 years.
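
The totals behind these percentages are not stated, but they can be recovered from the counts. The short sketch below does this; the helper name `implied_total` is an assumption, and the totals it yields are inferences from the rounded percentages rather than figures given in the text.

```python
def implied_total(count, percent):
    """Total number of errors implied by a count and its (rounded) percentage."""
    return round(count / (percent / 100))


urban_changed, rural_changed = 36, 53

urban_total = implied_total(urban_changed, 59.0)
rural_total = implied_total(rural_changed, 65.4)

# Errors that are omissions or answers recorded for children under 10.
urban_other = urban_total - urban_changed
rural_other = rural_total - rural_changed

print(urban_total, urban_other)
print(rural_total, rural_other)
```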

The next problem is as follows: is there any tendency among individuals
to declare themselves literate when they are not, or to declare
themselves illiterate when really literate? Letting the first type of
tendency be represented by + and the second by -, we found the
following distribution of changed data:


Stratum        Males          Females
              +     -        +     -
Urban         3     3       16    14
Rural         5     8       15    25


These figures do not agree with the rather general opinion, viz.
the existence of a marked tendency among individuals to declare
themselves literate even if not so. If there is a place here for any
tendency, it should be stated in just the opposite way (in accordance
with the results for the rural stratum).
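
One way to judge whether the pluses and minuses differ by more than chance is an exact sign test. The sketch below is an illustration and not part of the original analysis: the function name `sign_test_p` is an assumption, and the counts are taken from the table above with its columns read as + then - within each sex.

```python
from math import comb

# Changed answers: (+, -) counts by stratum and sex.
changes = {
    "urban": {"males": (3, 3), "females": (16, 14)},
    "rural": {"males": (5, 8), "females": (15, 25)},
}


def sign_test_p(plus, minus):
    """Two-sided exact sign test under the hypothesis that + and - are equally likely."""
    n = plus + minus
    k = min(plus, minus)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


totals = {}
for stratum, by_sex in changes.items():
    plus = sum(p for p, _ in by_sex.values())
    minus = sum(m for _, m in by_sex.values())
    totals[stratum] = (plus, minus)
    print(stratum, plus, minus, round(sign_test_p(plus, minus), 3))
```

The stratum totals agree with the 36 urban and 53 rural changed answers reported earlier.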

This conclusion might appear somewhat hazardous since both
minuses and pluses may conceivably represent errors in the second
report. But we think it is safe within the best possibilities of the
checking procedure in this field. Each changed answer was subject to a
special investigation.⁴ So only some intermediate cases (vide infra) might
represent a problem.


SOURCE OF ERRORS 


Now we face the very important practical question of the source of 
these errors. 


The information given by the respondents with the wrong census
answers shows that two main sources of errors exist. The first group
includes the extreme cases, namely individuals who are absolutely
literate or illiterate. The second group involves the intermediate cases.


⁴ A detailed description of the procedure is given in S. S. Zarkovich: Population Census Error




By the definition of the group, in the first case the answer on the 
question of literacy is known. If someone never learned reading and 
writing or if a person had a college education it is clear what the 
answer should be. But the errors still appear. For absolute illiterates 
we found answers “literate” and vice versa. 

In our census the source of these errors is the system of enumeration.
In our system the questionnaires were distributed one day before the
beginning of the census and collected the day afterwards. Meanwhile
everybody was supposed to fill in the answers personally (if literate). For
children and illiterate people the giving of answers was the duty of
parents or some other member of the family. The enumerator had to
check and correct the data given, or to put down the answers when
no one was able to do it (in villages).

Now, if a member of a family fills in the questionnaires for the others
he may not always be well informed on what the answer should be.
This holds particularly for people on the lower cultural level, where
no attention is paid to literacy. If the enumerators do this job on the
basis of information given by the head of the family, the errors appear
in the same way. It would be best if the enumerators had a separate
talk with each respondent. In this case the errors would
probably be less serious.

Consequently, these errors can be influenced primarily by changing 
the system of enumeration (if there is any possibility to do so). The 
recommendation in Population Census Methods by which the “general 
adoption of the criterion . . . ability to read and write a simple mes- 
sage in any language, would help to improve the comparability and 
meaningfulness of census statistics on this subject” does not seem to 
be useful. 

In the group of intermediate cases the errors appear because a person
with a low level of literacy declares himself literate and vice versa.
Here the respondents do not know what their answers should be.
To what degree should the ability to read and write be developed to
justify either of the two possible answers? The problem is the limit that
divides literacy from illiteracy. Considering that this limit can
only be defined in terms of some units, it is obvious that no system of
enumeration is likely to change the frequency of errors. It also seems
that the above recommendation in Population Census Methods could not
be expected to be helpful. We found a lot of people able to distinguish
any letter and read any word, but reading represented a tremendous
effort for them, in which they used 20 times more time than a man with
a university education. From the point of view of the “ability to read




and write a simple message” they are literate, but from any practical
point of view they are illiterate. In general, such an individual does
not use his ability to read and write at all, because this is for him as
painful a job as any other in which great physical efforts are
concerned. The limit of such a literacy is illiteracy.

MEASURING DEGREE OF LITERACY

Literacy is a continuous variable with illiteracy and complete
literacy at its extremes. Any intermediate value brings about the
question whether it should be called literacy or not. The difficulties
appear because of the very nature of literacy as a census
characteristic. If it is desired to have a more precise insight into what
the literacy of the “literate” people really means, there is no other way,
as it seems to me, than to draw a sample of “literate” individuals,
apply some measuring and, on the basis of the results received,
estimate the percentage of the people on different levels of literacy.

This was the aim of our second research. In connection with this the
following had to be done:

i) prepare tests for measuring,
ii) define the limits between different classes of literacy in terms of
units of these tests,
iii) apply these tests and calculate the percentage of people in
each class.


The results received are shown here in Tables 2 and 3. In the rows
of these tables is the reading score and in the headings, the writing
score.

These figures show the existence of correlation between the ability
to read and the ability to write. Then, one sees that there is a
percentage of people with a pretty high score in reading and a low one in
writing (the first two columns). They also show that among the . . .

At the same time these data show that even in the class of the literate
people (because in this research only those have been included who
declared themselves “literate” in the census) there is a number of people
who did not get any points in either reading or writing (first column,
first row). These are illiterate. Most of these cases were separately
investigated and their illiteracy was proved. It was found that their
presence in the class of literate people was due to the system of
enumeration: those giving answers for others put them in this group.

On the basis of these tables, the possibility of estimating the . . .




TABLE 2
DISTRIBUTION OF SCORES IN READING AND WRITING
(Urban stratum)
Reading Writing score
score 3 4 7.4. 747 1& 2
0 5 "quim WE NEC CE
1-5 з 1.145812 98:080 1^0 RN
6-10 4 - 2 4 б 6 je 6.4 lo, TEE
11-15 - 1 & У т 5 4 10 7 58
16-20 2 1 0) 8 2 7 9 12 и
21-25 1 1 ЗЕЯ 5 13 о з 1 15
26-30 - =- - 3 - 2 5 8^7 20- M 59
31-35 - - = - 1 - H з n 18 и
36-40 - - - = = 1 1 $8.19 319-95
41-45 - =- = = - 1 9! A 9 37
Total 15 4 8 18 2 3 069 4 6 0171 47
TABLE 3
DISTRIBUTION OF SCORES IN READING AND WRITING
(Rural stratum)*
Reading Writing score Total
score о а ООР
0 п id 1 14 D M
1-5 10 7 11-0 SM TT
6-10 4 в 19^ 11^ 35/0098 181 9 ИОК
11-15 4 4 4 117,42 9 10 18 10 15
16-20 3 4 2 зи 9» 9» 3 ш ш
21-25 1 з 1.1 3 ЩЩ LM MU
26-30 - - 27. 7 0» ^74 и n A
31-35 - 1 CMM 4 7 LE
36-40 - „2%ы з 6 7 9? a»; “u
41-45 1 6 1 з 4 Mw
Total 40 27 33 30 00 121 171 128 236 1% 1022


Are now in process, . 




But another illustration of the limits can be given here. Taking into
consideration only the errors in reading, it was found that a man
with a university education takes on the average 6 minutes to get
through all tests. In doing so he generally makes no errors and gets
45 points. In other words, one point takes approximately 8 seconds.
In the same way, a score of 5 points in 10 minutes means an average
of 120 seconds per point, or reading 15 times slower than a man with
a university education. Adopting now the criterion that a score of
0-5 points, i.e. reading 15 or more times slower than our standard,
represents the class “illiteracy,” that 5-15 points, or reading 15 to 5
times slower, defines “middle literacy,” and that 15 points and over
defines the class “literacy,” our data give the following distribution:


Degree of               Stratum
literacy             Rural    Urban
Illiteracy             6.7      8.6
Middle literacy       21.6     27.0
Literacy              71.7     64.4
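
The speed criterion behind these classes can be written out as a small calculation. The sketch below is an illustration under stated assumptions: the names `slowdown` and `literacy_class` are invented here, and the treatment of the boundary scores (exactly 5 counted as illiteracy, exactly 15 as literacy) is an assumption, since the text leaves the boundaries open.

```python
STANDARD_SECONDS_PER_POINT = 6 * 60 / 45   # university-educated reader: 8 seconds per point
TEST_SECONDS = 10 * 60                     # time limit for each group


def slowdown(points):
    """How many times slower than the standard reader a given score implies."""
    if points == 0:
        return float("inf")
    return (TEST_SECONDS / points) / STANDARD_SECONDS_PER_POINT


def literacy_class(points):
    # Assumption: 5 points falls in "illiteracy" and 15 points in "literacy".
    if points <= 5:
        return "illiteracy"
    if points < 15:
        return "middle literacy"
    return "literacy"


for score in (5, 10, 15, 45):
    print(score, round(slowdown(score), 1), literacy_class(score))
```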


To complete these data Figure 1 is also given. On the x-axis is the
score in reading and on the y-axis the percentage of persons having
reached the respective score. This figure is characteristic for the
problem of literacy.

Consequently, the results of such research can be used to enlarge
the knowledge of what is behind the general term “literate” people.


CONCLUSIONS 


Data on literacy, as obtained by the census, probably are not very
reliable in any country, although the degree of their value may vary
considerably in connection with specific conditions. The possibility
of influencing their reliability by means of better definition is also
doubtful.


There are two main problems in this field: i) the value of the answers
of those absolutely literate and absolutely illiterate and ii) the quality
of answers of intermediate cases, i.e. of the people who are neither
completely literate nor illiterate. In the first case the right answer
depends mostly upon the system of enumeration, and in the second
one upon the lack of a criterion as to the level of ability to write and
read at which literacy begins. To control the meaningfulness of data
belonging to this second class, an experiment of the described sort is
very useful.


[Figure: percentage of people having reached a given score (y-axis, 0 to 100 per cent) plotted against reading score (x-axis, 0 to 45), with separate curves for the rural and urban strata.]

FIG. 1. Decreasing degree of literacy.


Some may agree to the usefulness of such a control, but doubt
might arise as to the possibility of carrying through a similar
large-scale research. The problem may especially arise in connection with
the willingness of the people to be tested.

Perhaps some words on our experience will be useful here. To carry
on this experiment we used 250 young employees of the statistical
office who had been trained two to three hours a day during less than
two weeks. They were charged with the whole field work in the
application of sampling methods in connection with this census. Most of
them had never had any contact with psychology and education, but
in spite of that their technique of experimentation was considered by
experts as very satisfactory.

On the other hand, we did not have any difficulties with people.
Before these experiments started, our inspectors contacted respondents
in connection with the control of the completeness of enumeration
and the control of all answers in the census questionnaire. On this
occasion they also made contact with the people selected for this
experiment and explained to them the purpose and the sense of this work.
The result was that out of 1,439 originally selected persons only 14
did not come to the testing place. They were replaced by others,
also selected at random.
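
The attendance implied by these figures is easy to check. The lines below are a minimal sketch; the variable names are chosen here for illustration, and the total is rebuilt from the two stratum samples reported earlier.

```python
selected = 417 + 1022          # urban + rural persons originally selected
absent = 14                    # persons who did not come to the testing place

attendance_rate = (selected - absent) / selected
print(selected, f"{attendance_rate:.1%}")
```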

If this experience has any meaning for other countries, I should
conclude that the sampling control of literacy data does not raise
serious difficulties.


RESPONSE ERRORS IN ESTIMATING 
THE VALUE OF HOMES* 


LESLIE KISH and JOHN B. LANSING
Survey Research Center, University of Michigan 


In the 1950 Survey of Consumer Finances home owners
were asked to estimate the market value of their houses.
Estimates for these same homes were later made by professional
appraisers. These two estimates for each of 568 homes
comprise the data analyzed here. The proportion of discrepancies
between the two estimates is great: only 37 per cent of the
estimates by respondents are within plus or minus 10 per cent
of the appraisers’ estimates. However, the errors tend to be
offsetting, and in none of the ten price classes used is the
difference in the relative frequencies for owners and appraisers
statistically significant. Similarly, although the root-mean-square
difference between the two measurements is high (an average
of $3,100), the mean of the respondents’ estimates is only
$350 higher than the mean of $9,200 for the appraisers’
estimates. The amount of variability is found to be rather
similar for several sub-populations. However, for houses worth
over $10,000 the mean-square difference between the
measurements is found to increase with the value of the home. In the
Appendix a model is developed for the statistical investigation
of the data.


INTRODUCTION 


Knowledge of the over-all financial position of consumers has been a
primary objective of the Survey of Consumer Finances conducted
annually since 1945 by the Board of Governors of the Federal Reserve
System in cooperation with the Survey Research Center of the
University of Michigan.¹
More than half of American families live in their own homes, and
for the vast majority of these families, that home is their most
valuable single asset. To be complete, then, any analysis of the financial
position of consumers must cover this asset.
In the 1950 Survey of Consumer Finances, respondents were asked
to give their idea of what their house was worth. The answers they
* The authors are indebted . . . They are also indebted to the American Institute of Real Estate Appraisers, the Federal
Housing Administration, and the Society of Residential Appraisers for their participation in the . . .

¹ For a discussion of the methods used in this survey see G. Katona, L. Kish, J. B. Lansing and J. K. Dent,
“Methods of the Surveys of Consumer Finances,” Federal Reserve Bulletin, 36 (1950).


RESPONSE ERRORS IN ESTIMATING VALUE OF HOMES 521 


gave have been tabulated and on the basis of those replies tables were
published in the Federal Reserve Bulletin² showing distributions in
class intervals of owners’ estimates of the current value of their homes.
The published distributions are for all owners, for owners having differ- 
ent incomes, owners with different occupations, and owners living in 
towns and cities of different sizes. This is important basic information 
for the student of housing economics. 

The question naturally arises, how reliable are these data? How
much does the average householder know about the going market price
for his house? Assuming for the moment that he does know, is his
answer to the interviewer's question likely to be seriously biased? Are
recent buyers of homes more informed about current market conditions
than owners who may have bought many years earlier?

It was in an effort to answer some of these questions that a special 
attempt was made to evaluate the responses given to questions con- 
cerning house values in the 1950 Survey of Consumer Finances. Re- 
spondents who reported they owned their own homes were asked in 
January and February 1950 whether they had purchased their homes 
in 1949 or in some earlier year. Those who had purchased before 1949 
were asked: “Could you tell me what the present value of this house is? 
I mean about what would it bring if you sold it today?” (A similar 
question was asked in the 1950 Census of Housing.) Those who had 
purchased their homes during the year 1949 were asked: “How much
did the house and lot cost?”

Subsequent to the completion of interviewing, it was decided to
check the estimates of respondents by obtaining estimates from
qualified residential appraisers. Through the cooperation of the American
Institute of Real Estate Appraisers, the Federal Housing
Administration, and the Society of Residential Appraisers, arrangements were
made to have professional appraisers visit a substantial number of the
properties. The appraisers were not required to obtain access to the
property; they were asked to look at it from the outside and to estimate
its value in the light of their experience and familiarity with local real
estate conditions.

From the sample of home owners found in the yearly survey a sub- 
sample was selected, including respondents who failed to answer the 
questions about home-ownership, but not including any potential
respondents who had not been interviewed during the regular survey.


² J. A. Frechtling, J. H. Lorie and Irving Schweiger, “1950 Survey of Consumer Finances, Part V:
The Distribution of Assets, Liabilities, and Net Worth of Consumers, Early 1950,” Federal Reserve
Bulletin, 36 (1950), pp. 1595-07.




(In the subselection a higher probability of selection was given to 
the more extreme house values.) The sample was distributed roughly 
evenly among the three participating organizations. The response rate 
in the follow-up study was 89 per cent. This high response rate, a result 
of the excellent cooperation on the part of these professional groups, 
made possible the analysis which follows. 

The number of homes used in the first stage of the analysis is the 637 
for which forms were returned. In 30 of the 637 cases the value of the 
property was not indicated on the completed form. In an additional 39 
cases the respondent failed to give a usable answer to the question in 
the original survey. Hence there are 568 homes for which two estimates 
of value are available. (In calculating the response rate of 89 per cent 
mentioned above these 69 cases were treated as responses since some 
useful information is available about them. If the 69 were classified as 
non-responses, the response rate would be 79 per cent.) 
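
The two response rates can be reproduced from the counts in this paragraph. The sketch below is illustrative only: the size of the follow-up subsample is not stated in the text, so `implied_subsample` is an inference from the rounded 89 per cent figure, and all variable names are chosen here.

```python
returned = 637                 # forms returned by appraisers
no_appraiser_value = 30        # value missing on the completed form
no_respondent_answer = 39      # no usable answer in the original survey

usable = returned - no_appraiser_value - no_respondent_answer

# Inference, not a figure from the text: subsample size implied by "89 per cent".
implied_subsample = round(returned / 0.89)

broad_rate = returned / implied_subsample    # the 69 incomplete cases counted as responses
strict_rate = usable / implied_subsample     # the 69 incomplete cases counted as non-responses

print(usable, implied_subsample, f"{broad_rate:.0%}", f"{strict_rate:.0%}")
```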

Essentially, the analysis was divided into two stages. The first stage 
involved a simple comparison of the frequency distributions and cross 
tabulations obtained by the original survey and by the follow-up study. 
The second stage involved the statement of a mathematical model of 
the response error, and estimates of the terms of the basic equation of 
this model. Although the conclusions drawn in the second stage are
described in the main body of this article, the model itself appears
in the Appendix.


COMPARISONS OF CELL FREQUENCIES 


The first step in the analysis was to compare the frequency
distribution obtained from the survey of owners with that from the survey of
appraisers. The results of the comparison appear as the first two columns
of Table I. Columns (3) and (4) show cumulative totals for columns
(1) and (2), respectively.

The fifth column of Table I shows the distribution of appraisers’
estimates for the 39 cases which were “Not Ascertained” in the survey.
On seeing the “NA’s” in any table one is led to wonder about their
effect on the entire distribution. There is a mere suggestion of a
concentration of several homes with very low values among these 39 cases.
But anyone who assumed that the 39 cases should be distributed
proportionally would not have been led far astray.

In column (6) the differences between the entries of columns (1) and
(2) are given. These differences are subject to sampling variability.
If we sent out both interviewers and appraisers to repeated samples of
600 cases under identical conditions, we would expect that the bracket




distributions sometimes would show closer agreement than (1) and (2), 
and sometimes wider disagreement. The model in the Appendix per- 
mits us to estimate the probability that the proportions in any pair of 
cells will agree within a given range. That is, we can estimate how the 
differences shown in column (6) would fluctuate if the present study 
were repeated many times. The measures of this fluctuation, estimated 


TABLE I

FREQUENCY DISTRIBUTIONS OF THE VALUE OF OWNER-
OCCUPIED HOMES BASED ON ESTIMATES REPORTED BY
OWNERS AND APPRAISERS (UNCORRECTED)*

(percentage distribution of homes)

(1) Respondents' estimates
(2) Appraisers' estimates
(3) Cumulative respondents' estimates
(4) Cumulative appraisers' estimates
(5) Appraisers' estimates where respondents' were not ascertained
(6) Difference in proportions, (1)-(2)
(7) Standard error of difference in (6)

Value of Home           (1)     (2)     (3)     (4)    (5)     (6)     (7)
Under $2,500            2.9     2.3     2.9     2.3     14    +0.6%   0.7%
$2,500- 4,999          13.1    13.7    16.0    16.0     14    -0.6%   1.4%
$5,000- 7,499          19.6    19.8    35.6    35.8     20    -0.2%   1.9%
$7,500- 9,999          21.5    23.8    57.1    59.6     18    -2.3%   1.9%
$10,000-12,499         19.1    16.8    76.2    76.4      7    +2.3%   1.8%
$12,500-14,999          6.5     8.8    82.7    85.2     10    -2.3%   1.2%
$15,000-19,999          7.2     6.3    89.9    91.5      3    +0.9%   1.1%
$20,000-29,999          2.8     2.2    92.7    93.7      3    +0.6%   0.7%
$30,000 and over        1.5     1.4    94.2    95.1      3    +0.1%   0.4%
Value not ascertained   5.6     4.7    99.8    99.8      8    +0.9%   1.2%
Total                  99.8    99.8 b                  100
Number of homes         637     637     637     637     39

* These “uncorrected” distributions contain clerical errors which were discovered and corrected
in the course of comparing the data from respondents and appraisers. Later tables are based on
corrected data except as indicated.

b Detail does not add to 100.0% owing to rounding.


from the data of this study, are presented in column (7) in terms of the
standard errors of the differences. We may illustrate the interpretation
of these columns as follows: the discrepancy between the proportion of
homes placed in the bracket $2,500-4,999 by respondents and
appraisers was 0.6 per cent in the present study; if the study were repeated
many times, this difference would be less than 1.4 per cent in two
studies out of three in the long run, and it would be less than 2.8 per cent in
19 studies out of 20.
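
The "two out of three" and "19 out of 20" statements correspond to one and two standard errors under a normal approximation. The sketch below checks this; it assumes the differences are approximately normally distributed, and the function name `prob_within` is chosen here for illustration.

```python
from math import erf, sqrt


def prob_within(k):
    """P(|Z| < k) for a standard normal variable Z."""
    return erf(k / sqrt(2))


difference = 0.6       # observed discrepancy for the $2,500-4,999 bracket, per cent
standard_error = 1.4   # standard error of that difference, per cent

for k in (1, 2):
    print(f"within {k * standard_error:.1f} per cent: probability {prob_within(k):.3f}")

print("difference / standard error:", round(difference / standard_error, 2))
```

A ratio well below 2, as here, is what the text means by a difference that could be the result of random variation.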




The two distributions in (1) and (2) convey the same general impres- 
sion about the proportion of owner-occupied homes of different values, 
The same would be true of other similar distributions from replications 
of the present study in view of the relatively small size of the errors 
shown in (7). We make this judgment (and ask the reader to do like- 
wise) within the general framework of the errors and requirements of 
surveys of this kind and size. It would be fruitless for us to raise here 
the question: for what kind of decisions are our results “reliable
enough”? Our investigations do provide assurances against the exist- 
ence in the procedures of large response errors. “Large” here is taken in 
the context of the actual sizes of the sample and of the sampling 
errors—but we must neglect the question of the relative cost of reducing 
the response error. 

Although we find no reliable evidence of a net bias in any price class, 
it is possible (and even probable) that a large enough sample would un- 
cover biases which escape detection in this sample. We have shown 
only that the differences between columns 1 and 2 could be the result 
of random response variation. 

The second step in the analysis was to examine the discrepancies
between the estimates of respondents and appraisers. The similarity
between the first two columns in Table I could be the result either of
few errors or of many off-setting errors. Table II compares the
classification of the homes by respondents and appraisers. A sum of the
proportions in the cells along the diagonal indicates that 43 per cent of the
homes that were included in a given bracket by the respondents were
also placed in that bracket by the appraisers. Errors were, in fact,
frequent, but generally off-setting.


On close examination some of the differences shown in Table II
seemed out of all reason: how could any house valued by a respondent
at under $2,500 be valued by an appraiser at over $15,000? This
question raises the possibility of errors in the survey process made by others
than respondents and appraisers. The information in Table II was
used to guide a special search for errors. All cases where the two estimates
were in disagreement by more than one “bracket” (coded class of
house value) up to a value of $15,000, and above that value all cases
not in the same “bracket,” were selected for study.

This search involved a comparison of the original interview, the
appraiser's report, and the card on which the data had been punched.
Such a search is unlikely to turn up errors by interviewers in recording
the answers given by respondents, but it should disclose any errors in




coding. An examination of 109 cases yielded 17 errors, all but two of
them clerical errors by coders. Of the 17, four involved only errors in
the conversion of a dollar amount (entered correctly) to a bracket
(entered wrongly). There were 11 clerical errors made in coding the
respondents’ estimates, and two errors were made by interviewers.
Ten of these 11 errors involved entries of one-tenth of the proper
amount owing to the omission of a zero; in the other case, $11,000 was
read as $77,000. In addition to these errors two exceptional cases were
TABLE II

RELATION BETWEEN APPRAISER'S ESTIMATE AND
RESPONDENT'S ESTIMATE (UNCORRECTED)*

(percentage distribution of homes)

Rows: Appraiser's Estimate. Columns: Respondent's Estimate, in the order
Under $2,500; $2,500-4,999; $5,000-7,499; $7,500-9,999; $10,000-12,499;
$12,500-14,999; $15,000-19,999; $20,000-29,999; $30,000 & Over;
Value not ascertained; Total.
‘Under $2,500 1.0 0.4 0% от 2.3 
$2,500- 4,999 | 0.7 7.2 38 11 0.2 0.7 18.7 
$5,000- 7,499] | | 34 83 48 17 i1 19.8 
$7,500- 9,999 | [0.2] 0.5 5.0 M1 5.7 02 «06 10 243 
$10,000-12,49 | (0.4 оз 38 278 24 15 02 04 16.8 
$12,500-14,999 | (0.2 o: oi 23 30 13 06 DH 06.88 
$15,000-19,000 | 0.2] — 09 07 зз 1Q 01 02 989 
$0, 0-29,999 | —— |o.2| 08 ов 04 023 2.2 
$30,000 and over NET 0.2 02 08 Of 14 
Value not ascer- 
tained 03 14 13 04 0.6 02 0l Q1 05 47 
Total 1s ni mne si NA MN o Bae) SS eee 
Number ofen] з so в 109 ш 39, 5  % $9 687 


* The effect of the corrections on this table is to empty the cells indicated by boxes, distributing the entries among
other cells. The table reads as follows: 1.0% of all houses in the sample were valued at under $2,500 by the respondent and
also by the appraiser; 0.4% of all houses were valued at $2,500-$4,999 by the respondent, but at under $2,500 by the
appraiser; etc.

b These two cells contain one case apiece. They are the exceptional cases noted in the text.

c Because there were three different weights used, the percentages are not simple ratios of the total of 637.
noted. In one, it seemed clear that the appraiser had included only part 
of the property which the respondent had in mind. In the other, the 
appraiser based his estimate on the commercial value of the property, 
while the respondent based his on the value for residential purposes. 
The effects of these errors on the entries of a few cells in Table II are



shown: certain cells which are emptied by the corrections, or which
contain only exceptional cases, have been indicated by being enclosed
in “boxes.” All of the most extreme discrepancies in Table II
disappear, but the marginal distributions are little changed by the
corrections. It is interesting to note that the lowest class was composed in
large part of errors.³

The comparison in Table II is supplemented by another approach
in Table III: this presents the distribution of each respondent's
estimate divided by the appraiser’s estimate on his home. This division
was carried out for 568 homes. The respondents’ estimates were within
plus or minus 10 per cent of the appraisers’ in 37 per cent of the
cases. On the other hand, the discrepancy was more than plus or minus
30 per cent for 24 per cent. Of these 24 per cent, 18 per cent
represent overestimates by respondents, suggesting a tendency for owners to
overvalue their homes. This possibility can be better evaluated by
comparing the means of the two distributions (after correction of the
clerical errors).


TABLE III

FREQUENCY DISTRIBUTION OF RESPONDENT'S ESTIMATE
DIVIDED BY APPRAISER’S (IN BRACKETS)*

Respondent’s Estimate Divided
by Appraiser’s                     Proportion of Homes
Under 70%                                   6
70- 89%                                    20
90-109%                                    37
110-129%                                   19
130-149%                                    9
150% and over                               9
Total                                     100
Number of homes                           568
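
The percentages quoted in this paragraph follow directly from the brackets of Table III. The short check below is a sketch; the dictionary keys and variable names are chosen here for illustration.

```python
# Table III: per cent of the 568 homes, by respondent's estimate as a share of the appraiser's.
table_iii = {
    "under 70%": 6,
    "70-89%": 20,
    "90-109%": 37,
    "110-129%": 19,
    "130-149%": 9,
    "150% and over": 9,
}

within_10 = table_iii["90-109%"]                                   # within plus or minus 10 per cent
under_by_30 = table_iii["under 70%"]                               # underestimates beyond 30 per cent
over_by_30 = table_iii["130-149%"] + table_iii["150% and over"]    # overestimates beyond 30 per cent

print(within_10, under_by_30 + over_by_30, over_by_30)
```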


³ This finding and the $77,000 mistake have obvious implications for checking procedures. It should be
noted that the checking procedure used in processing the 1950 Survey of Consumer Finances
varied according to the nature and extent of the projected analysis of the data. The data on . . .
received the minimum amount of checking. For the type of distribution actually published in
the Federal Reserve Bulletin these clerical errors were of little importance. The clerical errors do have
a large effect on the errors of the estimated mean value; however, the mean was neither intended
nor submitted for publication.




COMPARISONS OF THE MEANS OF THE TWO DISTRIBUTIONS⁴


The difference between the means obtained by the two methods of
measurement is $9,560 - $9,210 = $350. That is: the mean of $9,560
obtained from the responses of the home-owners seems to include a 
bias of $350 (if we accept the appraisers’ values as “true”). This bias 
is in the direction one would expect. The standard error of the differ- 
ence was calculated (by a formula proper to the complexities of the 
sample design) to be $170. Hence there appears to be a tendency (sta- 
tistically significant) for the home-owners to set higher values on their 
homes than do the professional appraisers. This tendency is small 
compared to the value of the home—about 4 per cent of the latter. 
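
The significance statement can be followed step by step. The sketch below assumes a normal approximation for the difference of the means; the variable names are chosen here, and the standard error is taken as given in the text rather than recomputed from the sample design.

```python
from math import erf, sqrt

mean_respondents = 9560    # dollars
mean_appraisers = 9210     # dollars
se_difference = 170        # dollars, as calculated by the authors

bias = mean_respondents - mean_appraisers
z = bias / se_difference
p_two_sided = 1 - erf(abs(z) / sqrt(2))       # two-sided normal tail probability
relative_bias = bias / mean_appraisers

print(bias, round(z, 2), round(p_two_sided, 3), f"{relative_bias:.1%}")
```

A ratio a little above 2 corresponds to significance at about the 5 per cent level, and the relative bias rounds to the 4 per cent quoted in the text.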

This net average bias may appear small also in comparison with the 
large discrepancies found in the two values obtained for individual 
homes. The mean square difference of the two measurements was esti- 
mated as 9,580,000 compared to the estimate of the squared bias of 
100,000 (see equation 15 in Appendix). This result is consistent with 
the findings presented in Tables I and II which show also large dis- 
crepancies in individual estimates but small differences in the overall 
distribution of the two measurements. 

The relative importance of a bias depends on the size of the survey
to be taken. The sample mean of a simple random sample of $n$
interviews with respondents may be expected to be subject to a total
root-mean-square error of $\sqrt{V(r)/n + D^2}$, where the first term
under the radical represents the total variability of the estimates from
the survey of respondents about their own mean and the second, the
square of the bias.⁵ As the size of the sample increases the first term
will decrease but the second will remain constant.

We may use the sample estimates obtained in our investigation to
examine the effect of the bias on the total error. For $V(r)$ we have the
estimate $v(r) = 32{,}650{,}000$; and for $D^2$ we have the estimate
$d^2 = 100{,}000$ (see equation 15 in Appendix). Now let us take the
value of $\sqrt{V(r)/n + D^2}$ for three different sample sizes, and
under the two assumptions: that $D^2 = 100{,}000$ and that $D^2 = 0$.


4 We include this analysis because it may be of general interest. We repeat: the mean value
was not sought nor published in the original survey.

5 The term $V(r)/n$ represents the variance of the sample mean as it is usually […].
Actually it includes both the error resulting because not every member of the universe was […]
(the sampling error proper) and any uncorrelated random response error which may be […]
the methods used, such as random clerical errors (see equation 7 in Appendix). The […]
response errors will be reflected in the squared bias term, $D^2$. This expression shows that it is
possible to increase the accuracy of the estimate of a mean from a simple random sample in one of
three ways: by increasing the number of interviews (increasing $n$); by reducing the size of
$V(r)$ (by reducing some of the error of response); or by reducing the size of the bias (e.g.,
by more careful training of interviewers). The practical problem in the […] is to
allocate resources among the three in such a way as to minimize the total error for a given outlay.




528 AMERICAN STATISTICAL ASSOCIATION JOURNAL, SEPTEMBER 1954


The total (root-mean-square) error of the mean house value under the
six different conditions would then be as follows:

                                                      Value if Sample Size is
    Value of D²         Formula                         100     1,000    10,000

    (1) D² = 100,000    √(32,650,000/n + 100,000)       $650    $360     […]

    (2) D² = 0          √(32,650,000/n)                 $570    $180     […]


Note that the total error for a sample of 100 is not greatly increased
by the bias term, but, for a sample of 1,000, the effect of the bias
is large, while for a sample of 10,000 it is overwhelming. Where the
facts are similar to those found in this investigation, an improvement
in the accuracy of the estimate for surveys of a few hundred cases can
probably be obtained most easily by increasing the size of the sample.
For surveys of several thousand cases, however, it may be more
efficient to allocate funds for a search for sources of bias and for
development of techniques for reducing the bias than to allocate funds
for an increase of the size of the sample.
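
The comparison can be recomputed directly from the expression for the total root-mean-square error. The sketch below is illustrative only: it assumes the estimates v(r) = 32,650,000 and d² = 100,000 quoted from equation (15), and the function name `total_rmse` is ours.

```python
import math

V_R = 32_650_000.0   # v(r), squared dollars, from equation (15)
D2 = 100_000.0       # d^2, squared dollars, from equation (15)

def total_rmse(n, variance=V_R, bias_sq=D2):
    """Root-mean-square error of the mean of a simple random sample of size n."""
    return math.sqrt(variance / n + bias_sq)

for n in (100, 1_000, 10_000):
    with_bias = total_rmse(n)
    without_bias = total_rmse(n, bias_sq=0.0)
    print(f"n = {n:>6}: with bias ${with_bias:,.0f}, without bias ${without_bias:,.0f}")
```

Rounded to the nearest ten dollars, the results for n = 100 and n = 1,000 agree with the tabulated values. At n = 10,000 the bias term accounts for nearly all of the total error, which is the sense in which its effect is called overwhelming.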

We have investigated the possibility that the discrepancy between
the two measurements might prove to be a function of the value of the
house. One can imagine, for example, that respondents might tend to
overvalue low priced homes and to undervalue high priced homes. We
divided the homes into groups based on the appraisers’ value and
estimated the mean value of the houses in each group, first on the basis
of the respondents’ values for the houses and then on the basis of the
appraisers’. The difference between these two means has been plotted
in Chart I. (See solid line “A.”) This graph indicates that the
respondents tend to overvalue homes priced below about $12,000. For
homes priced above that amount no clear tendency to under- or
over-valuation appears. We feel that the discrepancy below $12,000 may
be explained in part, but only in part, by errors in the estimates made
by appraisers. Any such errors also would tend to give this graph a
general slope downward to the right.


_ From the data of Table 
less for estimates of the pi 


timate, However, if 
X =$10,000, Y =$1, 
in the other directi. 


[Chart I. The differences of two measurements (r − a), taken as a
function of the appraisers' values. A) The difference of class means,
$(\bar{r} - \bar{a})$. B) The root-mean-square differences, i.e., the
estimates of $\sqrt{E(r-a)^2}$. The values of the sample estimates are
given in units of thousands of dollars for both A) and B). Horizontal
axis: appraisers' values in thousands of dollars, with the number of
sample cases in each class.]


THE ROOT-MEAN-SQUARE DIFFERENCE 


As a measure of the average individual discrepancy between
respondents and appraisers we use the root-mean-square difference, that
is, the square root of the mean of the squared deviations between the
pairs of estimates. We estimate this quantity at $3,100 for the sample
as a whole. In other words if we assume that the appraiser's estimate
is the true value, the respondents are in error by an average of $3,100
in their estimates. (From equation 15 in Appendix.) Actually there is
no doubt that the appraisers also made errors, and the average
discrepancy between the respondents’ estimates and the “true value” of
the property would be less than $3,100.

How does the average discrepancy vary with the value of the home? 
Is the discrepancy a constant amount, or a constant proportion of the 
house, or some other function? On Chart I are plotted the root-mean- 
square differences—the r.m.s.(d) values—for each class of appraised 
values; the width of the intervals is $1,000 except at the ends, where 
classes were combined to obtain larger cells. (See the solid line “B.”) 
For values below $10,000 the r.m.s.(d) appears to be constant around 
$2,000. For values above $10,000 it is considerably more variable and 
larger; and, it appears to be proportional to the estimated value of the 
home. The line which represents a root-mean-square difference of one- 
fourth of the appraised value is drawn in. It appears to the eye to fit the 
distribution above $10,000 fairly well. In other words in our data the 
expected absolute value of the difference between the respondents’ and 
the appraisers' estimates is about $2,000 for a house worth less than 
$10,000; while, for a house worth over $10,000, the expected value of 
the difference is one-fourth of the appraisers’ estimate. For a $16,000 
house, one would predict a respondent would differ from an appraiser 
by $4,000; for a $20,000 house, one would predict $5,000, and so forth. 
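
The rule of thumb read off Chart I can be written as a small piecewise function. This is only a sketch of the verbal description above; the function name is hypothetical, and the treatment of a house appraised at exactly $10,000 is our assumption, since the text does not say which branch applies there.

```python
def predicted_rms_difference(appraised_value):
    """Approximate r.m.s. difference (dollars) between respondent and appraiser.

    Below $10,000 the difference is taken as a flat $2,000; at or above
    $10,000 (boundary handling assumed) it is one-fourth of the appraised value.
    """
    if appraised_value < 10_000:
        return 2_000.0
    return appraised_value / 4.0

for value in (8_000, 16_000, 20_000):
    print(f"${value:,}: predicted difference ${predicted_rms_difference(value):,.0f}")
```

The two branches do not meet at $10,000 ($2,000 against $2,500), which is in keeping with a line fitted by eye and not by a formal method.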


ANALYSIS OF SOME SUB-GROUPS 


One aim of our investigation was to discover some of the variables
which might be associated with response errors. For three
cross-tabulations comparisons were made of the ratio of respondents’ to
appraisers’ values. An attempt was made in the original survey to
isolate those cases where the respondent seemed uncertain of his
estimate. If this attempt were successful, it was thought that it might
be possible to develop methods of analysis that would place more weight
on the more reliable cases. The procedure tried was to instruct the
interviewers as follows:


Since some respondents have a very clear idea of the value of their house,
based on such things as what the house next door just sold for, while others
have only very vague notions, we have left space after question 31 in which
you should note down any information he may give you about how he
arrived at his estimate of the value of the house. Our objective is to distinguish
between cases where we have the kind of accurate estimate we would prefer
and cases where we have only vague information. In any case be sure to
record the dollar value of the house.


The coders were then instructed to study the answer as recorded by the
interviewer and attempt to assign a rating according to how sure the
respondent seemed to be of his answer. This rating proved very difficult
to make; coders disagreed frequently as to the proper point on the scale
at which to place an answer. The relevant data (not given here) show
that the assigned rating of the appearance of reliability had no validity:
the errors were about equally large in the various classes of assigned
reliability.

Secondly, occupation of the head of the family owning the house was 
selected as a measure of socio-economic status, on the hypothesis that 
people of higher status might be better informed. Thirdly, the popula- 
tion of the place (city) of residence of the respondent was selected on 
the hypothesis that knowledge of real estate values would be different 
in communities of different sizes. None of these hypotheses were sub- 
stantiated; no sizeable differences were noted. 

For four subgroups of the sample we calculated separately the
estimates of our basic error equation (8). There exist a priori reasons why
the accuracy of the estimates in each of these groups might turn out
to be different from that in the entire sample. The calculated equations
are in the Appendix; here we shall summarize the results, using the
root-mean-square difference—r.m.s.(d)—as the measure of accuracy. The
conclusions we draw from these groups must be tempered by the
knowledge that they were not properly selected subsamples of the entire
sample; hence there may be other causes operating beyond that on
which we focus our attention.

a) In 65 cases the appraisers exceeded the minimum effort asked
of them and went into the homes. We expected that their estimates
would be more accurate, and that the r.m.s.(d) would be smaller.
However, the r.m.s.(d) for these 65 cases turned out to be $2,700
compared with $3,100 for the entire sample. The appraisers’ errors were
not clearly increased by remaining outside the house.

b) In homes purchased during the calendar year prior to the
interview, the respondent was asked what he actually paid for his home.
We expect that the reports of the respondents were fairly close
approximations to the true value at the time of purchase. The r.m.s.(d)
of $1,900 is reliably smaller than the $3,100 for the sample as a whole.
One should not infer, however, that the entire $1,900 is the result of
errors by appraisers. For one thing, real estate values change with
time, and up to a year might elapse between the purchase and the
original interview, with several months more passing before the visit
of the appraiser.

c) In the Surveys of Consumer Finances interviewers are instructed 
to make efforts to interview the head of the household rather than some 
other member. In the 91 cases where the interviews were taken from 
some other member of the household the r.m.s.(d) was not—contrary 
to expectations—larger than for the sample as a whole. In fact it was 
$2,500 as against $3,100 for the entire sample. 

d) There were 59 cases where the head of the house was a female. 
For these the r.m.s.(d) was $3,900, which appears to be reliably higher 
than for the entire sample. 

The only important improvement in accuracy, then, was for
respondents who purchased in the year prior to the survey. These
respondents, as noted earlier, were asked what the property actually cost
rather than their estimate of what it might be worth; hence it is not
surprising that their responses are close to the appraisers' estimates.


APPENDIX 


The Model. The symbol $r_i$ denotes the value recorded at the $i$th home
as a response in the interview survey; and $a_i$ denotes the value assigned
by the appraiser to the same home. The “true” (but unknown) value
is $y_i$. Where there is little room for misunderstanding we shall drop the
subscript $i$, and refer simply to $r$, $a$ and $y$. The means over the entire
population for the three sets of values may be designated by:

$$R = E(r), \qquad A = E(a), \qquad Y = E(y). \tag{1}$$

The operator “E” denotes the “expected value of.”⁸ The variances of
the three variables may be designated by:

$$V(r) = E(r - R)^2, \qquad V(a) = E(a - A)^2, \qquad V(y) = E(y - Y)^2. \tag{2}$$


8 The means of the measurements $r_i$ and $a_i$ over a finite population would be variables also due to
the errors of measurements. But we may treat $R$ and $A$ as constants if we consider them as resulting
from a large number of reported measurements, or as coming from a large population. By confining
ourselves to large populations we may also disregard any “finite population corrections” in our variance
formulas. The terms used here are generally in accord with those in: M. H. Hansen, W. N. Hurwitz
and W. G. Madow, Sampling Survey Methods and Theory (New York: Wiley, 1953) II, Chap. 12.
Another good treatment of the topic of errors of response may be found in W. G. Cochran, Sampling
Techniques, New York: Wiley, 1952, Chap. 13.

However, none of the sources known to us develop the model we need in terms of the differences
$(r - a)$ of two sets of measurements, both subject to error. To what extent these non-sampling errors
may be considered to be random variables is a complex problem which we shall have to leave untreated.




The quantity $(r_i - y_i)$ denotes the individual error of the response in
the interview survey for the $i$th home; and $(a_i - y_i)$ denotes the error
in the appraiser's estimate for the same home. The difference between
the two errors is equal to the difference of the two measurements:

$$d_i = (r_i - y_i) - (a_i - y_i) = (r_i - a_i). \tag{3}$$

Furthermore, let us call the mean value $(R - Y)$ the response bias;
$(A - Y)$ the appraisers’ bias; and the difference between the two biases
is

$$D = (R - A) = (R - Y) - (A - Y). \tag{4}$$
An important term in our model is the mean-square difference of the
measurements:

$$\text{M.S.}(d) = E(d^2) = E(r - a)^2. \tag{5}$$

We also need the expression for the covariance between the differences
in measurements and the appraiser's values:

$$\text{Cov}(da) = E(d - D)(a - A) = E(r - a - R + A)(a - A) = E(r - a)(a - A), \tag{6}$$

$$\text{Cov}(da) = E(r - R - a + A)(a - A) = E(r - R)(a - A) - E(a - A)^2 = \text{Cov}(ra) - V(a). \tag{6'}$$
With the above definitions, we may express the basic equation for
our empirical investigations:

$$V(r) + D^2 = V(a) + \text{M.S.}(d) + 2\,\text{Cov}(da). \tag{7}$$

For proof express $E(r - A)^2$ in two different ways:

$$E(r - A)^2 = E(r - R + R - A)^2 = V(r) + D^2$$

and

$$E(r - A)^2 = E[(r - a) + (a - A)]^2 = \text{M.S.}(d) + V(a) + 2\,\text{Cov}(da).$$
Our model would be simpler if the appraiser gave the “true” value
for every home, so that $a_i = y_i$; and the error equation would become

$$V(r) + (R - Y)^2 = V(y) + E(r - y)^2 + 2\,\text{Cov}(r - y)(y).$$

Here $V(y)$ is the “true” sampling variance, i.e., the variance among the
$y_i$, which are the “true” values of the homes; and

$$V(r) + (R - Y)^2 - V(y) = E(r - y)^2 + 2\,\text{Cov}(r - y)(y)$$

is the increase in the total mean-square error due to errors of
measurement. Similarly, the increase in the total mean-square error, due to
the lesser accuracy of the $r_i$ than the $a_i$, may be measured as

$$V(r) + D^2 - V(a) = E(r - a)^2 + 2\,\text{Cov}(r - a)(a).$$
It is also interesting to note the relationship

$$E(r - a)^2 = E(r - y)^2 + E(a - y)^2 - 2E(r - y)(a - y).$$

The covariance $E(r - y)(a - y)$ of the two measurements may be positive
or zero, but it is not likely to be an important negative quantity in the
present instance. Therefore, the term $E(r - a)^2$ available in this study
is likely to be larger than the mean-square error of response $E(r - y)^2$
by a quantity no greater than (but perhaps almost equal to) the
mean-square error $E(a - y)^2$ of the appraiser’s measurements.
Although the results were obtained from a complex multi-stage
sample, the discussion is given in terms of the composition of the response
error for the individual homes which are the ultimate elements
comprising the population. The expressions of the relative effects of the
bias and of the variable error are given in terms of simple random
samples. It is hoped that in this form the data will be of greater general
interest and usefulness in planning other surveys. The calculations are
based on the “naive” estimates from the pooled sample values; greater
refinements did not seem to be warranted by the available data.
The basic relationship shown in (7) may be expressed in terms of
sample estimates as

$$v(r) + d^2 = v(a) + \text{m.s.}(d) + 2\,\text{cov}(da). \tag{8}$$

We have the following unbiased estimates:

$$\bar{r} = \frac{1}{n}\sum_{i=1}^{n} r_i, \qquad \bar{a} = \frac{1}{n}\sum_{i=1}^{n} a_i, \tag{9}$$

$$v(r) = \frac{1}{n-1}\sum_{i=1}^{n} (r_i - \bar{r})^2, \qquad v(a) = \frac{1}{n-1}\sum_{i=1}^{n} (a_i - \bar{a})^2, \tag{10}$$

$$\text{cov}(ra) = \frac{1}{n-1}\sum_{i=1}^{n} (r_i - \bar{r})(a_i - \bar{a}), \tag{11}$$

9 For the benefit of future researchers we should like to point out that an estimate of […]
could have been obtained had we assigned some of the homes to two appraisers each; we thought of it
too late to carry out the necessary field work.
10 For the sake of simplicity and because we have no measure of it, we disregard the correlation
among the errors of individual homes, such as may be caused by interviewer bias.
11 Because the responses were “weighted” to correct for the use of different sampling rates, the
actual sample calculations were somewhat different from those shown here. For example, $\bar{r}$ was
calculated as $\sum w_j r_j / \sum w_j$, where $r_j$ is the response, and $w_j$ is the assigned weight of the
$j$th home in the sample. The calculation of the variances may be illustrated by

$$v(r) = \frac{n}{n-1} \cdot \frac{\sum w_j (r_j - \bar{r})^2}{\sum w_j}.$$


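$$\text{cov}(da) = \text{cov}(ra) - v(a) \tag{12}$$

$$\text{m.s.}(d) = \frac{1}{n}\sum_{i=1}^{n} d_i^2 = \frac{1}{n}\sum_{i=1}^{n} (r_i - a_i)^2 = \frac{1}{n}\left[\sum r_i^2 + \sum a_i^2 - 2\sum r_i a_i\right]. \tag{13}$$

Although $(\bar{r} - \bar{a})$ is an unbiased estimate of $D$, $(\bar{r} - \bar{a})^2$ is not an
unbiased estimate of $D^2$; but $d^2$ is an unbiased estimate where

$$d^2 = (\bar{r} - \bar{a})^2 - \frac{1}{n}\left[v(r) + v(a) - 2\,\text{cov}(ra)\right]. \tag{14}$$

This happens because

$$E(\bar{r} - \bar{a})^2 = E\left[(\bar{r} - R) - (\bar{a} - A) + (R - A)\right]^2 = V(\bar{r}) + V(\bar{a}) - 2\,\text{Cov}(\bar{r}\bar{a}) + D^2.$$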


The unbiased estimate $d^2$ has some advantages: together with the
other unbiased estimates from the sample, it yields values for our error
equation (8) which balance out exactly. However, it also has some
disadvantages: it is a residual of sample values and it turns out to be
negative sometimes—an embarrassing situation for the square of a real
quantity. (One may decide to truncate the distribution of $d^2$ at zero
by substituting the value zero for all negative sample estimates.
Alternately, one may use simply $(\bar{r} - \bar{a})^2$ with the knowledge that it has
a positive bias of known magnitude.)
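
The sample estimates and the exact balance of equation (8) can be illustrated with a minimal sketch. It assumes an unweighted simple random sample, whereas the actual calculations used weights. The function name and the six pairs of values are hypothetical and serve only to exercise the formulas.

```python
def error_components(r, a):
    """Return (v_r, d2, v_a, ms_d, two_cov_da) for paired measurements r and a."""
    n = len(r)
    r_bar = sum(r) / n
    a_bar = sum(a) / n
    v_r = sum((x - r_bar) ** 2 for x in r) / (n - 1)
    v_a = sum((y - a_bar) ** 2 for y in a) / (n - 1)
    cov_ra = sum((x - r_bar) * (y - a_bar) for x, y in zip(r, a)) / (n - 1)
    ms_d = sum((x - y) ** 2 for x, y in zip(r, a)) / n          # equation (13)
    d2 = (r_bar - a_bar) ** 2 - (v_r + v_a - 2 * cov_ra) / n    # equation (14)
    two_cov_da = 2 * (cov_ra - v_a)                             # equation (12), doubled
    return v_r, d2, v_a, ms_d, two_cov_da

# Hypothetical respondent and appraiser values, in thousands of dollars.
r = [9.0, 12.5, 7.0, 15.0, 10.0, 6.5]
a = [8.5, 11.0, 8.0, 14.0, 9.0, 7.0]

v_r, d2, v_a, ms_d, two_cov_da = error_components(r, a)
lhs = v_r + d2
rhs = v_a + ms_d + two_cov_da
d2_truncated = max(d2, 0.0)   # the truncation at zero mentioned in the text

print(f"left side  v(r) + d^2                    = {lhs:.6f}")
print(f"right side v(a) + m.s.(d) + 2cov(da)     = {rhs:.6f}")
```

The two sides agree to rounding error for any paired data, which is the sense in which the unbiased estimates "balance out exactly"; `d2` itself may come out negative in small samples.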


SOME CALCULATIONS ON THE DOLLAR VALUE OF THE HOUSE

The five terms of the basic equation (8) of the estimates of error
components will be presented in this section for several situations. They
will be given in units of $1,000; since in these variance components the
units are squared, a factor of $10^6$ is needed to convert them to plain
dollar values.

1) Our principal interest is in the components of the equation dealing
with all the 568 cases:

$$32.65 + .10 = 26.69 + 9.58 - 3.52. \tag{15}$$

Note the relatively large m.s.(d) term which yields the
$\sqrt{9.58 \times 10^6} = \$3{,}100$ estimate for the r.m.s.(d) between the two
measurements on individual homes. But most of the discrepancies cancel,
leaving a much smaller net average error; the unbiased estimate of this
bias is $\sqrt{.10 \times 10^6} = \$320$.


The total root-mean-square error of the responses is
$\sqrt{32.65 \times 10^6} = \$5{,}700$, which is not much over the
$\sqrt{26.69 \times 10^6} = \$5{,}200$ we would get from appraisers’ values.
Therefore, the practical surveyor may well be satisfied with the
precision of the interview response—if the bias term is not too large.

The difference between the two variance terms is reduced by the
sizeable negative covariance term ($-3.52 \times 10^6$) between the differences
in measurements $(r - a)$ and the appraisers’ values. This is probably due
in part to overestimates among the appraisers' high values and
underestimates among their low values. The negative covariance is in
accord with the gentle negative slope of curve A on Chart I.

2) If we allow the 13 gross coding errors (mentioned earlier) to
stand uncorrected, the components are estimated as

$$37.35 + .04 = 26.69 + 15.81 - 5.11.$$

Thus over a third of the original m.s.(d) term of 15.81 was due to the
13 gross errors. However, the rise in the variance of the responses is
more moderate (32 to 37). Moreover the estimate of the sample mean
may be no worse off for these errors (ironically) because the bias term
seems to be somewhat reduced. It seems that all these gross errors
were in the direction of lowering the home-owners’ estimates and, as
noted above, home-owners suffer from a tendency to overestimate the
value of their homes.

The effect of these coding errors on curve (A) in Chart I is to make
the $(\bar{r} - \bar{a})$ values for the classes above $10,000 more depressed and
more irregular. Curve (B) of the r.m.s.(d) values is also disturbed
above $10,000: the curve becomes more irregular and the slope becomes
greater (it seems to fit the line of $\sqrt{E(r - a)^2} = a/3$).


3) For the 65 cases where the appraiser went into the home the
components are

$$37.07 - .09 = 34.31 + 7.29 - 4.62.$$

4) For the 61 cases where the response was in terms of the amount
paid for a recently purchased home we have

$$23.14 - .05 = 21.76 + 3.67 - 2.34.$$

If we assume that the respondents gave the “true value” of their homes
in these cases then we may accept this m.s.(d) term of 3.67 as an
estimate of the appraisers’ contribution to the discrepancy term.

5) For the 91 cases where the respondent was not the head of the
household the values are

$$37.04 + .28 = 28.77 + 6.29 + 2.26.$$

6) For the 59 cases where the head of the household was a female
the equation is

$$44.78 + .33 = 28.10 + 15.17 + 1.84.$$
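
As a check on the six sets of components, the following sketch verifies that each equation balances and converts each m.s.(d) term to an r.m.s.(d) in dollars. It takes the m.s.(d) for the recent-purchase group as 3.67, the value quoted in the sentence accompanying that equation; the short labels and helper names are ours.

```python
import math

# (label, v(r), d^2, v(a), m.s.(d), 2cov(da)), in units of 10^6 squared dollars.
published = [
    ("all 568 cases",              32.65,  0.10, 26.69,  9.58, -3.52),
    ("13 coding errors retained",  37.35,  0.04, 26.69, 15.81, -5.11),
    ("appraiser entered home",     37.07, -0.09, 34.31,  7.29, -4.62),
    ("recent purchase",            23.14, -0.05, 21.76,  3.67, -2.34),
    ("not head of household",      37.04,  0.28, 28.77,  6.29,  2.26),
    ("female head of household",   44.78,  0.33, 28.10, 15.17,  1.84),
]

def imbalance(v_r, d2, v_a, ms_d, two_cov_da):
    """Left side minus right side of equation (8)."""
    return (v_r + d2) - (v_a + ms_d + two_cov_da)

def rms_dollars(ms_d):
    """Convert an m.s.(d) term in units of 10^6 to an r.m.s.(d) in dollars."""
    return math.sqrt(ms_d * 1e6)

for label, *terms in published:
    print(f"{label:<28} imbalance {imbalance(*terms):+.3f}   "
          f"r.m.s.(d) ${rms_dollars(terms[3]):,.0f}")
```

Rounded to the nearest hundred dollars, the r.m.s.(d) values agree with the $3,100, $2,700, $1,900, $2,500 and $3,900 cited in the discussion of the sub-groups.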


RESULTS ON PROPORTIONS

When we deal with the proportion of cases which fall into any class
interval our variables are binomial. The values of $r_i$ and $a_i$ are restricted
to 0 and 1; and the value of $d_i = (r_i - a_i)$ is either 0, +1 or −1. The
basic equation (8) of the estimated error components becomes:

$$\left[\frac{n}{n-1}\,p_r q_r\right] + \left[(p_r - p_a)^2 - \frac{1}{n-1}\left\{p_r q_r + p_a q_a - 2p_{ra} + 2p_r p_a\right\}\right]$$
$$= \left[\frac{n}{n-1}\,p_a q_a\right] + \left[p_r + p_a - 2p_{ra}\right] + \left[\frac{2n}{n-1}\left\{p_{ra} - p_r p_a - p_a q_a\right\}\right]. \tag{16}$$

Here $p_r$ is the proportion of the homes placed into a specific frequency
group by the responses to the interviews, while $p_a$ is the proportion
placed into that group by the appraisers’ estimates. Also $p_{ra}$ is the
proportion placed into the same group both by respondent and by
appraiser. Furthermore, $q_r = 1 - p_r$ and $q_a = 1 - p_a$. The equation for the
5,000-7,499 group would be, as read from the values of Table II:

$$\left[\frac{637}{636}(.196)(.804)\right] + \left[(.196 - .193)^2 - \frac{1}{636}\left\{(.196)(.804) + (.193)(.807) - 2(.083) + 2(.196)(.193)\right\}\right]$$
$$= \left[\frac{637}{636}(.193)(.807)\right] + \left[.196 + .193 - 2(.083)\right] + \left[\frac{2(637)}{636}\left\{.083 - (.196)(.193) - (.193)(.807)\right\}\right].$$

In Table IV, columns (1) to (5), we present the estimates of the five
components of equation (16) for each of the classes shown in Table I.
In column (6) we show the difference $(p_r - p_a)$ between the proportion
assigned to each bracket in the surveys of respondents and appraisers.
In column (7) we show the standard error of each difference shown in
column (6).
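
The five components of equation (16) can be computed for the $5,000-7,499 group as a check on the worked example. In this sketch the proportions .196, .193 and .083 are those of the example; n = 637 is read from the numerator appearing in that example and should be treated as an assumption, and the function name is ours.

```python
def proportion_components(p_r, p_a, p_ra, n):
    """Return (v_r, d2, v_a, ms_d, two_cov_da) of equation (16) for one class."""
    q_r = 1.0 - p_r
    q_a = 1.0 - p_a
    f = n / (n - 1)
    v_r = f * p_r * q_r
    v_a = f * p_a * q_a
    d2 = (p_r - p_a) ** 2 - (p_r * q_r + p_a * q_a - 2 * p_ra + 2 * p_r * p_a) / (n - 1)
    ms_d = p_r + p_a - 2 * p_ra
    two_cov_da = 2 * f * (p_ra - p_r * p_a - p_a * q_a)
    return v_r, d2, v_a, ms_d, two_cov_da

v_r, d2, v_a, ms_d, two_cov_da = proportion_components(0.196, 0.193, 0.083, 637)
labels = ("v(r)", "d^2", "v(a)", "m.s.(d)", "2cov(da)")
for label, value in zip(labels, (v_r, d2, v_a, ms_d, two_cov_da)):
    print(f"{label:<9} {value:+.4f}")
```

To four decimal places the values of v(r), d², m.s.(d) and 2cov(da) obtained this way agree with the entries for this class in Table IV, and the two sides of the equation balance exactly.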




TABLE IV

VALUES OF THE TERMS OF THE ERROR EQUATION (16) FOR THE
PROPORTION IN EACH OF THE FREQUENCY CLASSES
AS SHOWN IN TABLES I AND II

Columns (1) to (5): values of the components of the error equation,
v(r) + d² = v(a) + m.s.(d) + 2cov(da).
Column (6): difference between proportions found, (p_r − p_a).
Column (7): standard error of the difference (p_r − p_a).

                       (1)      (2)      (3)      (4)       (5)       (6)     (7)
Class                  v(r)     d²       v(a)     m.s.(d)   2cov(da)

$0- 2,499              .0282   −.0000    […]      .0320    −.0263    +0.6%   0.7%
$2,500- 4,999          .1140   −.0002    […]      .1240    −.1286    −0.6%   1.4%
$5,000- 7,499          .1578   −.0003    […]      .2230    −.2215    +0.3%   1.9%
$7,500- 9,999          .1690   +.0004    […]      .2360    −.2508    −2.8%   1.9%
$10,000-12,499         .1548   +.0002    […]      .2080    −.1880    +2.3%   1.8%
$12,500-14,999         .0609   +.0004    […]      .0930    −.1121    −2.3%   1.2%
$15,000-19,999         .0669   −.0000    […]      .0710    −.0632    +0.9%   1.1%
$20,000-29,999         .0273   −.0000    […]      .0340    −.0283    +0.6%   0.7%
$30,000 and over       .0148   −.0000    […]      .0130    −.0120    +0.1%   0.4%
Not ascertained        .0530   −.0001    […]      .0930    −.0850    +0.9%   1.2%

Two Illustrative Cumulated Groups:
$0- 7,500              .2296   −.0003    .2287    .2000    −.2084    +0.3%   1.8%
$0-10,000              .2454   +.0008    .2412    .2180    −.2085    −2.5%   1.8%


Note that the m.s.(d) terms, denoting the variability due to the
difference of the two responses, in column 4 of Table IV are large;
generally they are as large as, or larger than, the v(r) and v(a) terms which
ordinarily stand for sampling variability—shown in columns 1 and 3.
One may be tempted to assume that this variability would be much
less if larger groups were investigated; however, the two larger groups
shown on the bottom two lines of Table IV, comprising respectively
about 35 per cent and 60 per cent of the population, also have m.s.(d)
terms almost as large as the v(r) and v(a) terms.

In spite of the large m.s.(d) the value of [v(r) + d²] is hardly any
larger than v(a). This is due to the large negative covariance term.
That is: there exists a large gross response variation but its net effect
on variability is very small.

The net effect in terms of bias is even smaller. There is no bias term
in column 2 which is reliable in terms of the standard error. If we
average the ratios of the d² values to the respective v(r) values over the 10
classes we obtain .0005. In the calculations on the dollar mean the ratio
of the d² term to the v(r) term was .0030. Thus we may say that the bias
term for the proportions remains undetected; and if it exists its effect
on the total error is probably less than in the case of the dollar mean.




A COMPARISON OF STRATIFIED TWO-STAGE 
SAMPLING SYSTEMS 


A. R. SEN, Uttar Pradesh, India
R. L. ANDERSON AND A. L. FINKNER, North Carolina State College


This paper deals with an empirical investigation of various
stratified two-stage sampling systems for estimating totals of
certain agricultural items of North Carolina. The 1940
Agricultural Census data were used for stratification, selection and
estimation purposes. The observed data were the results of
the 1945 Agricultural Census. Theory for the selection of n
primary sampling units from a stratum with probability
proportional to some measure of size but without replacement
has already been developed by the senior author [11]. The
principal contribution in this paper is the application of this
theory to the selection of two primary sampling units without
replacement from a stratum, where one of the units is selected
with probability proportional to size and the other with equal
probability. These results are compared with sampling systems
(i) where both units are selected with probability proportional
to size but with replacement and (ii) where an equal number
of primary sampling units are selected but only one from each
stratum.


1. INTRODUCTION


THE theory for selecting a single primary sampling unit (p.s.u.)
per stratum with probability proportional to size (p.p.s.) in
two-stage designs was developed by Hansen and Hurwitz [1] in 1943 and
was applied to human populations. The theory for selecting more than
one p.s.u. from each stratum with p.p.s. but with replacement was
developed by Hansen and Hurwitz [2]. Theory for selecting two p.s.u.’s
without replacement has been developed independently by Midzuno
[8], Horvitz and Thompson [3], Narain [9], and Sen [10]. Both Midzuno
and Sen generalized the Hansen and Hurwitz approach to sampling
a combination of n elements of the universe with probability
proportional to some measure of size of the combination. Sen [11] further
derived an expression for an unbiased estimate of the variance of the
estimate. The theory thus developed has been applied to four items
of the North Carolina (N.C.) agricultural population. Results of the
investigation will be presented in this paper.

One of the important results derived by Hansen and Hurwitz [1] was
that selection of a p.s.u. from a stratum with p.p.s. was more efficient
than selection with equal probability for a large class of populations.
Their results were based on the between p.s.u. components only, the
within p.s.u. component being relatively small in all instances. Using
a county as a p.s.u., Jebe [4] showed that the within p.s.u. component
was relatively large for many agricultural populations of N.C.,
considering any reasonably practicable total sample size. He recommended the
need for investigation of the township¹ as a p.s.u. This aspect of the
sampling problem is also examined in this paper.
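
The efficiency comparison between p.p.s. and equal-probability selection of a single p.s.u. can be illustrated with a small sketch. It considers the between p.s.u. component only and uses the simple estimator Y_i / p_i of the stratum total, which is our choice for illustration and is not one of the sampling systems studied in this paper; the five sizes and totals are hypothetical.

```python
def between_psu_variance(totals, probs):
    """Variance of the one-draw estimator Y_i / p_i of the stratum total."""
    stratum_total = sum(totals)
    return sum(p * (y / p - stratum_total) ** 2 for y, p in zip(totals, probs))

# Hypothetical stratum of five p.s.u.'s.
size_1940 = [120, 300, 80, 500, 200]     # measure of size X_i
total_1945 = [130, 280, 95, 520, 190]    # item totals Y_i to be estimated

n_psu = len(size_1940)
pps_probs = [x / sum(size_1940) for x in size_1940]
equal_probs = [1.0 / n_psu] * n_psu

# Both selection schemes give an unbiased estimator of the stratum total.
expected_pps = sum(p * (y / p) for y, p in zip(total_1945, pps_probs))

var_pps = between_psu_variance(total_1945, pps_probs)
var_equal = between_psu_variance(total_1945, equal_probs)

print(f"between-p.s.u. variance, p.p.s.:            {var_pps:,.0f}")
print(f"between-p.s.u. variance, equal probability: {var_equal:,.0f}")
```

When the item totals are roughly proportional to the measure of size, as in these made-up figures, p.p.s. selection gives a much smaller between p.s.u. variance. The within p.s.u. component, which this sketch ignores, is the part Jebe found to be large when counties serve as p.s.u.'s.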

The values of four characteristics of the N.C. agricultural population 
have been studied. These are: 


1. Number of non-white operators
2. Value of land and buildings
3. Number of days worked off farms
4. Total number of farms.


The sources of data were:


(a) U. S. Census of Agriculture 1945, vol. 1 part 16 (North Carolina
and South Carolina),
(b) The 1945 Census of Agriculture for each minor civil division
and the sample Census of Agriculture as available in I.B.M.
punch cards.


For a general description of the population studied reference may be
made to [7]. Most of the notations and terms employed in this paper
have been used by Jebe [4]. For others, reference may be made to […].

The principal objectives of this investigation were:

(i) to examine some applications of theory already developed for
the selection of one p.s.u. per stratum,

(ii) to develop new theory for the selection of two p.s.u.'s per
stratum,

(iii) to compare these two selection procedures empirically.


Some theoretical comparisons of the two selection procedures were
made; however, no useful rules were found for indicating a preference
for either procedure.


2. SAMPLING SYSTEMS WITH ONE P.S.U. SELECTED
FROM EACH STRATUM

Three intensities of stratification were employed in this study, viz.,
197 strata [6], 98 strata, and 40 strata. The stratification was based on
data provided by the 1940 Census of Agriculture. The counties of Dare
and Swain were omitted from the study as they had only a small
number of farms. The state was first divided into 197 strata following m.c.d.
lines and then 98 by combining two adjacent old strata, except that two
of the new strata each contained about two and a half of the old strata.
In this division care was taken to construct, as nearly as practicable,
equal sized strata measured in terms of 1940 number of farms.
Geographic contiguity of the m.c.d.’s within a stratum was maintained.
The 40 strata were formed by combining contiguous counties of N.C.
These forty strata were used for sampling designs with the county and
with the m.c.d. as the p.s.u. However, equality of size of the strata was
not feasible in this case.

1 In North Carolina the township is also referred to as a minor civil division (m.c.d.).

The basic sampling design employed consists of two stages of
sampling in which

(a) one p.s.u. (i.e., a county or an m.c.d.) is selected from each
stratum with equal probability or probability proportional to
size, and

(b) a constant number of sub-sampling units (s.s.u.) (except in the
40 strata design) is selected at random from the s.s.u.'s located
in the open country area of the p.s.u. selected in (a).

The s.s.u.’s are area segments delineated for the Master Sample of
Agriculture project [5]. In the 40 strata design the total number of
s.s.u.’s specified for the state was allocated proportionally among
strata, i.e., proportional to the total number of s.s.u.'s in the open
country portion of each stratum in 1945.

A summary of the various designs examined is given in Table 1.
These designs are classified into five sampling systems A, B, C, D and
E. A sampling system consists of the sample design and the method of
estimation. A notation for designating the sampling systems discussed
in this paper has been adopted. For simplicity this notation is confined
to a single stratum, as is the discussion to follow. If $P$ is the function
designating the selection probabilities to be used and $Y''$ is an estimator
for the population total $Y$ for the characteristic of interest, then
$(P, Y'')$ denotes the sampling system. For the two-stage sampling
designs under consideration, where simple random sampling is always
used in the second stage, $P$ has been confined to the probabilities used
for selecting the primary sampling units. To illustrate, if a single p.s.u.
(say the $i$th) is selected and subsampled, let $Y_i'$ be an unbiased
estimate of $Y_i$, its population total. Further suppose $W_i$ is a weight
function associated with $Y_i'$ such that $W_i Y_i'$ is an estimator for $Y$, the
stratum total. If the p.s.u. is selected with probability proportional to $X_i$,
then the sampling system is $[X_i/X,\; W_i Y_i']$. The sampling systems
A, B, C, D, and E are as follows:


542 AMERICAN STATISTICAL ASSOCIATION JOURNAL, SEPTEM 


A. (i) p.s.u. selection: p.p.s. with the number of farms in 1940 ($F_{0i}$) as the measure of size.
   (ii) s.s.u. selection: designs 3, 4, 9, 10, 15, and 16 in Table 1.
   (iii) estimation: ratio to the value of the characteristic in 1940 ($X_i$) for the p.s.u. sampled.
   (iv) sampling system (biased): $\left[\dfrac{F_{0i}}{F_0},\ \dfrac{X}{X_i}\,Y_i'\right]$.

B. (i) p.s.u. selection: equal probability for each of the $N$ p.s.u.'s in the stratum.
   (ii) s.s.u. selection: designs 5, 6, 11, 12, 17, and 18 in Table 1.
   (iii) estimation: same as in A.
   (iv) sampling system (biased): $\left[\dfrac{1}{N},\ \dfrac{X}{X_i}\,Y_i'\right]$.

C. (i) p.s.u. selection: as in A.
   (ii) s.s.u. selection: designs 3, 4, 9, 10, 15, and 16 in Table 1.
   (iii) estimation: ratio to the number of farms in 1940 for the p.s.u. sampled.
   (iv) sampling system: $\left[\dfrac{F_{0i}}{F_0},\ \dfrac{F_0}{F_{0i}}\,Y_i'\right]$.

D. (i) p.s.u. selection: p.p.s. with the value of the characteristic in 1940 as the measure of size.
   (ii) s.s.u. selection: designs 1, 2, 7, 8, 13, and 14 in Table 1.
   (iii) estimation: ratio to the value of the characteristic in 1940 for the p.s.u. sampled.
   (iv) sampling system: $\left[\dfrac{X_i}{X},\ \dfrac{X}{X_i}\,Y_i'\right]$.

E. (i) p.s.u. selection: equal probability.
   (ii) s.s.u. selection: designs 5, 6, 11, 12, 17, and 18 in Table 1.
   (iii) estimation: estimated p.s.u. total weighted by the number of p.s.u.'s in the stratum.
   (iv) sampling system: $\left[\dfrac{1}{N},\ N\,Y_i'\right]$.


It should be clear that the method of selecting the p.s.u. remains the same for each of the four characteristics observed when sampling systems A, B, C, or E are used. This is not the case for system D, in which the probability of selection depends on the value of the characteristic in 1940 for the p.s.u. sampled. Systems A, B, C, and E, therefore, can be recommended for a general purpose survey, i.e. where the purpose of the survey is to estimate the totals of several characteristics. System D, however, could be recommended only where information is desired on only one characteristic or where additional characteristics observed must be subordinated in favor of a single characteristic.
TABLE 1
SAMPLING DESIGNS FOR ONE P.S.U. PER STRATUM

Design                                            No. of s.s.u.'s        Sampling
No.      Method of selection of p.s.u.            per selected p.s.u.    Rate %

                                197 Strata
 1       P.P.S.-Value of characteristic in 1940          2                 1
 2       P.P.S.-Value of characteristic in 1940          4                 2
 3       P.P.S.-Number of farms in 1940                  2                 1
 4       P.P.S.-Number of farms in 1940                  4                 2
 5       Equal probability                               2                 1
 6       Equal probability                               4                 2

                                98 Strata
 7       P.P.S.-Value of characteristic in 1940          4                 1
 8       P.P.S.-Value of characteristic in 1940          8                 2
 9       P.P.S.-Number of farms in 1940                  4                 1
10       P.P.S.-Number of farms in 1940                  8                 2
11       Equal probability                               4                 1
12       Equal probability                               8                 2

                                40 Strata*
13       P.P.S.-Value of characteristic in 1940          5                 0.5
14       P.P.S.-Value of characteristic in 1940         20                 2
15       P.P.S.-Number of farms in 1940                  5                 0.5
16       P.P.S.-Number of farms in 1940                 20                 2
17       Equal probability                               5                 0.5
18       Equal probability                              20                 2

* No. of s.s.u.'s per p.s.u. is an average figure.

Expressions for mean square errors, i.e. the between and within components of variance and the bias terms, for the various sampling systems are given in the appendix. The general procedure for derivation will be indicated here. Consider the sampling system $(Q_p,\ W_i Y_i')$ as illustrated above. It is easy to see that, in general, this system is biased for the estimation of $Y$, because

$$E(W_i Y_i') = E_1 E_2 (W_i Y_i') = E_1 (W_i Y_i),$$

where $E_1$ refers to expectation over the first stage of sampling and $E_2$ over the second stage. This estimate is unbiased if, and only if, $Q_p = 1/W_i$. In particular, let $Q_p$ be equal to $P_i$ where $\sum_i P_i = 1$. Thus

$$E(W_i Y_i') = \sum_i P_i W_i Y_i.$$




This sum equals $Y$ if, and only if, $P_i = 1/W_i$ for all $i$. Hence, $[1/W_i,\ W_i Y_i']$ is an unbiased sampling system. A general expression for the mean square error for systems in the class considered is given by

$$E[W_i Y_i' - Y]^2 = \underbrace{E[W_i^2 (Y_i' - Y_i)^2]}_{\text{Within variance}} + \underbrace{E[W_i Y_i - E(W_i Y_i)]^2}_{\text{Between variance}} + \underbrace{[E(W_i Y_i) - Y]^2}_{\text{Square of bias}}. \tag{1}$$


The mean square error for a particular system is obtained by substituting the corresponding values of $W_i$ for the system in equation (1) above and taking the expectations according to the probability function $Q_p$, e.g. for system A, $W_i = X/X_i$ and $Q_p = F_{0i}/F_0$. The mean square error for the state estimate is found by summing (1) over all strata, except that the bias is first summed over all strata and the total bias is squared.
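The substitution into equation (1) can be illustrated numerically. The sketch below evaluates the within, between, and squared-bias components exactly for a single stratum under each of the systems A to E. The stratum of four p.s.u.'s, their figures, the within variances `V`, and the function name `mse_components` are all invented for illustration and are not taken from the study.

```python
def mse_components(Y, V, Q, W):
    """Exact components of equation (1) for one stratum.

    Y[i]: true total of p.s.u. i
    V[i]: variance of the estimate Y_i' given that p.s.u. i is selected
    Q[i]: probability of selecting p.s.u. i
    W[i]: weight applied to Y_i'
    """
    total = sum(Y)
    expected = sum(q * w * y for q, w, y in zip(Q, W, Y))      # E(W_i Y_i)
    within = sum(q * w * w * v for q, w, v in zip(Q, W, V))    # E[W_i^2 (Y_i' - Y_i)^2]
    between = sum(q * (w * y - expected) ** 2 for q, w, y in zip(Q, W, Y))
    bias_sq = (expected - total) ** 2
    return within, between, bias_sq


# Hypothetical stratum (all numbers are assumptions made for this example).
F0 = [50.0, 100.0, 150.0, 200.0]   # number of farms in 1940
X = [40.0, 120.0, 130.0, 210.0]    # value of the characteristic in 1940
Y = [45.0, 110.0, 150.0, 195.0]    # value of the characteristic in 1945
V = [4.0, 9.0, 16.0, 25.0]         # within-p.s.u. variance of Y_i'
N = len(Y)

systems = {
    "A": ([f / sum(F0) for f in F0], [sum(X) / x for x in X]),
    "B": ([1.0 / N] * N, [sum(X) / x for x in X]),
    "C": ([f / sum(F0) for f in F0], [sum(F0) / f for f in F0]),
    "D": ([x / sum(X) for x in X], [sum(X) / x for x in X]),
    "E": ([1.0 / N] * N, [float(N)] * N),
}
results = {name: mse_components(Y, V, Q, W) for name, (Q, W) in systems.items()}
```

In this toy stratum the squared bias vanishes (up to rounding) for C, D, and E, where the selection probability equals $1/W_i$, and is positive for A and B, in line with the unbiasedness condition stated above.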


3. ANALYSIS OF VARIANCE COMPONENTS FOR SYSTEMS 
A, B, C, D, AND E 

In order to compare sampling systems A, B, C, D, and E the estimated (C.V.)² × 10⁴, where (C.V.)² is the estimated mean square error divided by Y², are presented in Tables 2 and 3. The within p.s.u. components are shown in Table 2, the total error in Table 3. The between components of error may be obtained by subtraction. This latter component includes both the between p.s.u. variance and the bias contribution for systems A and B. In calculating the between component contributions, exact expressions for the expected values have been obtained. Since information was available for only a one in eighteen systematic sample of the Master Sample segments within each county, considerable difficulty was experienced in obtaining estimates of the within p.s.u. component of the total error. Furthermore, this sample embraced incorporated, unincorporated and open country areas. In this connection Jessen [5, p. 536] says, “The areas into which the open country zone was partitioned serve as units for sampling either or both farms and persons whether farm or non-farm. This portion of the sample is as useful, therefore, for a sample census of population as for a sample census of farms. This dual purpose sampling unit is feasible only in the open country, where the majority of the families are engaged in farming.” Hence only data for the open country area were used in the estimation of within p.s.u. variances. The Master Sample segment summary cards which belonged to incorporated or unincorporated places or in a few cases to open country areas falling within metropolitan districts, or which formed subunits of multiple units had to be dropped. As the open country area included about 90 per cent of the total number of segments for North Carolina, the within p.s.u. variation for this section of the state is fairly representative of the entire state. Strictly speaking, however, the conclusions drawn are valid


TABLE 2

ESTIMATED (C.V.)² × 10⁴ FOR THE WITHIN P.S.U. COMPONENT OF ERROR

SAMPLING SYSTEM*                   A†       B†       C        D        E
CHARACTERISTIC
Sampling Rate (%)                 1  2     1  2     1  2     1  2     1  2
197 Strata (m.c.d.)
No, of Non-White Operators us 5 ш в 4 2902 5 230 4 зш 
alu of Land and Buildings ш 77 11 5 8 1714 2 80 УДА a E pa 
No, of Daya Worked off Farms 129 60 м 72 16 в з Hh 1! 8 
"Total No. of Farms 13 77 ов оТ МБЕ ДАР СЕДАН 
98 Strata (m.c.d.)
No, of Non-White Operators m 5 ш 6 49 n» 5 3 ^" и 
Value of Land and Buildings mo 754897 8.711. е ee, 
LO NeofDas Worked off Fame 166 71 19 в 5 Т и. 5 n et 
Total No. of Farms 14 6 15 7 и в и 6 16 8 
Sampling Rate (%)                 0.5  2   0.5  2   0.5  2   0.5  2   0.5  2
40 Strata (m.c.d.)
Ко, of Non-White Operators 200 30 35 38 70 Mw м ш 8 M 
Value of Land and Buildings 3 54 а 195 000 Гаа ЫИ ЫЫ 
No.of Days Worked off Farms 681 83 835 90 30 4 54 8 з 5 
Total No, of Farma 27 4 28 4 27 4 27 4 30 5 
40 Strata (County)
No, of Non-White Operators 145 35 152 35 т 33 106 25 15 
Value of Land and Buildings 4,3 3 49/50/9889, A 98/19 hei 
No. of Days Worked off Farms сс 16 55 71 gf 17 4 10 з nd 4 n 
o Total No. of Farms 36 9 37 9 9 36 9 4 10 


* See Section 2 for definitions of sampling systems.
† The bias contribution to the total error is included in the between p.s.u. (C.V.)².


for the open country area only. It was further assumed that the estimate of the within county and m.c.d. variation obtained from systematic sampling is approximately equal to that of a random sample if an equal number of segments are selected from the same population.

The within p.s.u. component required the estimation of both within county and within m.c.d. variation. For a few of the counties and for a great many more of the m.c.d.'s, the number of s.s.u.'s available for estimating the within variation was too small to provide efficient estimates. Furthermore, of the 941 m.c.d.'s used in this investigation, 342 provided no estimates of the within variation, since data were available in each of these for either none or only one s.s.u.

The method of estimation used consisted of pooling the observed within p.s.u. variances for contiguous p.s.u.'s so that the resulting estimates were based on more degrees of freedom, thus increasing their


TABLE 3

ESTIMATED (C.V.)² × 10⁴ FOR THE TOTAL COMPONENT OF ERROR

SAMPLING SYSTEM*                   A†       B†       C        D        E
CHARACTERISTIC
Sampling Rate (%)                 1  2     1  2     1  2     1  2     1  2
197 Strata (m.c.d.)
Мо. of Non-White Operators 137 74 1% 90 52 2 а з 6 0 
Value of Land and Buildings аз и 6 © >ш в n s И 
No. of Days Worked off Farms — 681 — 612 10 008 з 55 в м 76 
Total No. of Farms 17 Wr Wie We ат т 9 и газу 
98 Strata (m.c.d.) 
No. of Non-White Operators 161 87 9». 12 656 а б з 8 8 
Value of Land and Buildings 3 9 и и 7 23 з 1 4 8 
No. of Days Worked off Farms 118 1074 105 1204 133 194 122 108 166 157 
Total No. of Farms ош алва эи юн s s 
Sampling Rate (%)                 0.5  2   0.5  2   0.5  2   0.5  2   0.5  2
40 Strata (m.c.d.)
No. of Non-White Operators 91 12 49 ш 120 в ш ы 20 
Value of Land and Buildings 62 35 70 39 в в з 2% ш 8 
No. of Days Worked off Farms 4601 4004 5285 4500 386 3601 345 299 505 ATT 
Total No. of Farms T оеп 4 и 4 п юп 8 @ 
40 Strata (County) 
No. of Non-White Operators ш — 39 10 44 12 з 0 28 1% 10 
Value of Land and Bulldings 2 — aw c4 и 5 њо nn в 0 
No. of Days Worked off Farms — 114 6 — 16: — 108 152 121 9% 00 201 16 
Total No. of Farms з 10 и вв з 00 39 1 е 8 


* See Section 2 for definitions of sampling systems.
† The bias contribution to the total error is included in the between p.s.u. (C.V.)².


stability. This method assumes, of course, that the true within p.s.u. variances do not vary for those p.s.u.'s which were combined. Since this assumption is not generally valid, the estimates obtained of the within variation may be slightly biased, thus affecting any comparison among two or more sampling systems. This point needs further investigation. Three sets of pooled estimates of within p.s.u. variance were worked out.

(a) The 40 strata with the county as the p.s.u. were grouped into 20 strata,⁴ each new stratum consisting of two contiguous old strata. The within county variances for each new stratum were pooled to yield an over-all estimate of variance for the stratum. This estimate was used for each county in the stratum.
(b) For each of the 20 strata obtained in (a) the within m.c.d. variances were pooled to yield another estimate of the within p.s.u. variance for each stratum when the designs using 40 strata with the m.c.d. as the p.s.u. were studied.
(c) The 98 strata with the m.c.d. as the p.s.u. were pooled into 20 strata such that the stratification was almost the same as in (a) and (b). The within m.c.d. variances within each new stratum were pooled to yield an estimate of within p.s.u. variance for the stratum. This was applied to each m.c.d. within each stratum, when the designs using 98 and 197 strata were studied.
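The pooling of within p.s.u. variances described above can be sketched as follows. This is a minimal illustration of a pooled variance weighted by degrees of freedom; whether the study used exactly this weighting is an assumption, and the function name and sample values are hypothetical.

```python
def pooled_within_variance(groups):
    """Pool sample variances across p.s.u.'s, weighting by degrees of freedom.

    groups: one list of s.s.u. observations per p.s.u. in the pooled stratum.
    P.s.u.'s with fewer than two s.s.u.'s contribute nothing, since they
    provide no estimate of within variation.
    Returns (pooled variance, total degrees of freedom).
    """
    ss_total = 0.0
    df_total = 0
    for values in groups:
        n = len(values)
        if n < 2:
            continue
        mean = sum(values) / n
        ss_total += sum((v - mean) ** 2 for v in values)
        df_total += n - 1
    if df_total == 0:
        raise ValueError("no p.s.u. has two or more s.s.u.'s")
    return ss_total / df_total, df_total


# Hypothetical pooled stratum: three p.s.u.'s, one with a single s.s.u.
pooled, df = pooled_within_variance([[3.0, 5.0], [10.0], [2.0, 4.0, 6.0]])
```

Here the p.s.u. with a single observation is skipped, and the pooled estimate rests on three degrees of freedom rather than the one or two available from each p.s.u. separately.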




The within county contributions to total error are considerably 
greater than the between contributions for all the sampling systems. 
Even with the m.c.d. as the p.s.u., this contribution is a very important 
factor. As pointed out in the introduction, this fact was also observed 
by Jebe [4]. Hence it might be feasible to consider a delineation of 
s.s.u.’s which are more homogeneous than the present Master Sample 
segments.

It can be seen from Table 3 that the total (C.V.)² for all the characteristics and for all the sampling systems using 40 strata is less when the p.s.u. is a county than when the p.s.u. is an m.c.d. This difference is marked for number of days worked off farms for all the sampling systems, particularly A and B. One reason for this marked difference is the smaller within contribution when the p.s.u. is a county. This seemingly anomalous result arises from the instability of the weights ($W_i = X/X_i$) for number of days worked off farms (and also for number of non-white operators) when the p.s.u. is an m.c.d. In many m.c.d.'s, $X_i$ is very small even though the probability of selection ($F_{0i}/F_0$) is large; since the within component is $E[W_i^2 (Y_i' - Y_i)^2]$, a very small value of $X_i$ (large value of $W_i$) can have a tremendous effect on the within contribution. The $W_i$ are much more stable when the p.s.u. is a county.
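The effect of an unstable weight can be seen in a small numerical sketch. It computes the per-p.s.u. terms of the within component for system A, i.e. selection probability times squared weight times within variance, for three hypothetical m.c.d.'s. All figures and the function name are invented for illustration.

```python
def within_contributions(F0, X, V):
    """Per-p.s.u. terms Q_i * W_i^2 * V_i of the within component, system A.

    Q_i = F0_i / F0 (selection probability), W_i = X / X_i (weight),
    V_i = variance of Y_i' within p.s.u. i.
    """
    F_total, X_total = sum(F0), sum(X)
    return [(f / F_total) * (X_total / x) ** 2 * v for f, x, v in zip(F0, X, V)]


# Hypothetical m.c.d.'s with equal numbers of farms and equal within variances.
# The third has a very small 1940 value X_i, hence a very large weight.
F0 = [100.0, 100.0, 100.0]
X = [50.0, 45.0, 5.0]
V = [10.0, 10.0, 10.0]

terms = within_contributions(F0, X, V)
share_of_third = terms[2] / sum(terms)
```

With these assumed figures the third m.c.d. accounts for nearly all of the within component, although its selection probability and within variance are the same as for the other two.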

Six main comparisons of sampling systems have been made on the basis of the relative errors shown in Tables 2 and 3. These comparisons can be divided into two groups.

4 These will be described below in the discussion on the selection of two or more p.s.u.'s from a stratum.



(a) Comparisons of sampling systems differing in the method of selection

It would appear from the tables that the biased system A, where selection is p.p.s. to number of farms in 1940 but the estimator is the ratio to the value of the characteristic in 1940, is more efficient for all the characteristics and for all stratifications than the biased system B, where selection is made with equal probability with the same estimator. The gain in efficiency is, however, very small when 40 counties are used for estimating the number of non-white operators and value of land and buildings. This does not mean that selection with equal probability is almost as effective as p.p.s. selection, since the appropriate comparison is on only the between p.s.u. components of variance. For the between p.s.u. components of variance, system A showed considerable gain in efficiency relative to system B. System D, where selection is made with p.p.s. to value of the characteristic in 1940, is more efficient than system A or B for all characteristic totals, except for total number of farms where systems A and D are identical. The relative efficiency of system D is highly pronounced for estimating the number of non-white operators and number of days worked off farms when the m.c.d. is used as the p.s.u.

(b) Comparisons of sampling systems differing in the method of estimation

The unbiased system C, in which selection is p.p.s. to number of farms in 1940 and the estimator is ratio to number of farms in 1940, is generally more efficient than the biased system A, when the m.c.d. is used as the p.s.u. The gain in efficiency is most pronounced for number of days worked off farms, and the relative efficiency is identically unity for total number of farms. With the county as the p.s.u., the relative efficiency of C to A is considerably reduced and is in fact less than unity for value of land and buildings and number of days worked off farms.

The unbiased system E, where selection is with equal probability and estimation is accomplished by a simple expansion of the estimated p.s.u. total by the number of p.s.u.'s in the stratum, is generally more efficient than the biased system B for estimating the number of non-white operators and the number of days worked off farms using the m.c.d. as the p.s.u. However, the situation is reversed for value of land and buildings and total number of farms, for which the correlations between the 1940 and 1945 values are both high. When the county is used as the p.s.u., system B is more efficient than system E for estimating totals of all the characteristics.

As regards the between components of variance, the county is always a better p.s.u. for all the characteristics for biased systems A and B as compared to unbiased systems C and E. The reduction in the sampling error portion of this component for the biased systems A and B, due to the high correlations between the 1940 and 1945 values of each of the characteristics with the county as a p.s.u., more than compensated for the loss due to the bias contribution.

Because of the high C.V. for number of days worked off farms, it is very difficult to find a sampling system which will give acceptable results for all four characteristics.

Any one characteristic total can be estimated satisfactorily by use of system D for that characteristic; however, this procedure will not give satisfactory results for the other three characteristics. Suppose the standard is set up that, using a two per cent sample, each characteristic total shall be estimated with no more than a ten per cent C.V., i.e. an accuracy of estimation to within 20 per cent of the item total with 95 per cent confidence. None of the sampling systems will provide this accuracy for all types of stratification and p.s.u. considered here. However, if 40 strata with the county as the p.s.u. are used, system A will meet the standard and system B will almost meet it. If 197 strata are used, with a two per cent sampling rate, systems C and E will provide estimates of the characteristic totals within an eight per cent C.V.
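The correspondence between a ten per cent C.V. and an accuracy of about 20 per cent with 95 per cent confidence can be checked with a short calculation. The sketch assumes that the estimator is approximately normally distributed, which the text does not state explicitly; the function name is hypothetical.

```python
from statistics import NormalDist


def relative_half_width(cv, confidence=0.95):
    """Half-width of a two-sided confidence interval, relative to the total.

    Assumes an approximately normal, unbiased estimator with the given
    coefficient of variation.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return z * cv


half_width = relative_half_width(0.10)   # roughly 0.196, i.e. about 20 per cent
```

The same function gives the relative accuracy implied by the eight per cent C.V. quoted for systems C and E when called as `relative_half_width(0.08)`.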
None of the systems considered will provide an estimate of the number of days worked off farms within a five per cent C.V. Hence, it was deemed advisable to investigate the possibility of sampling from each of the 98 counties (this would be single stage sampling). If all 98 counties were sampled, the between component of the total error would vanish; hence, the total error would be simply the within component. This within component, using the 98 counties as strata, was determined easily from the calculations of the within component for system E for the 40 strata with the county as the p.s.u. In order to use the existing calculations, it is noted that, except for weighting factors, the within component of a 0.5 per cent sample using 40 counties corresponds to the total error for about a 1.25 per cent sample using all 98 counties (actually about 5 s.s.u.'s per county) and a 2 per cent sample of 40 counties corresponds to a 5 per cent sample of all 98 counties. The estimated per cent C.V. using the 98 counties as strata are presented below.

                                          Sampling Rate
Characteristic                            1.25%     5%

Number of Non-White Operators             7
Value of Land and Buildings               4.
Number of Days Worked off Farms           4.
Total Number of Farms                     4.




From these results it appears that a sampling rate of 2.5 per cent would be sufficient to estimate each characteristic within a five per cent C.V. if all 98 counties were included in the sample.
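One rough way to pass from a C.V. observed at one sampling rate to the rate needed for a target C.V. is sketched below. It assumes that (C.V.)² is inversely proportional to the sampling rate, which ignores the finite population correction and is not a rule stated in the paper. The input of seven per cent at a 1.25 per cent rate is an illustrative figure, and the function name is hypothetical.

```python
def required_rate(cv_known, rate_known, cv_target):
    """Sampling rate needed for cv_target, given cv_known at rate_known.

    Assumes (C.V.)^2 is inversely proportional to the sampling rate,
    i.e. the finite population correction is ignored.
    """
    return rate_known * (cv_known / cv_target) ** 2


# Illustrative input: a 7 per cent C.V. at a 1.25 per cent sampling rate.
rate_for_five_per_cent = required_rate(0.07, 1.25, 0.05)   # about 2.45 per cent
```

Under these assumptions a C.V. of seven per cent at a 1.25 per cent rate would call for a rate of roughly 2.5 per cent to reach five per cent.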


4. SELECTION OF TWO PRIMARY SAMPLING UNITS FROM A STRATUM 


Only one intensity of stratification was employed in this study. The 
state was divided into 20 strata. Each stratum was formed by combin- 
ing two contiguous strata of the 40 strata with the county as the p.s.u. 
discussed under selection of one p.s.u. from a stratum. Four sampling 
systems F, G, H, and K were considered for each of the four char- 
acteristics. These are: 


F. (i) p.s.u. selection: the first with p.p.s. to the value of the characteristic in 1940 and the other with equal probability but without replacement from the remaining p.s.u.'s in the stratum.
   (ii) s.s.u. selection: random and independently from each of the p.s.u.'s selected in (i) above. The number of s.s.u.'s selected from each of the 20 strata was proportional to the total number of s.s.u.'s in the open country area of the stratum. Two subsampling rates were used, i.e. 0.5 and 2.0 per cent.
   (iii) estimation: ratio of the estimated total of the characteristic to the total value of the characteristic in 1940 for the p.s.u.'s selected in (i).
   (iv) sampling system: $\left[\dfrac{X_i + X_j}{(N-1)X},\ \dfrac{Y_i' + Y_j'}{X_i + X_j}\,X\right]$.

G. (i) p.s.u. selection: the first with p.p.s. to the number of farms in 1940 and the other with equal probability but without replacement from the remaining p.s.u.'s in the stratum.
   (ii) s.s.u. selection: same as in F (ii).
   (iii) estimation: ratio of the estimated total of the characteristic to the total number of farms in 1940 for the p.s.u.'s selected in G (i).
   (iv) sampling system: $\left[\dfrac{F_{0i} + F_{0j}}{(N-1)F_0},\ \dfrac{Y_i' + Y_j'}{F_{0i} + F_{0j}}\,F_0\right]$.

H. (i) p.s.u. selection: the two p.s.u.'s each with p.p.s. to the value of the characteristic in 1940 but with replacement.
   (ii) s.s.u. selection: same as in F (ii).
   (iii) estimation: average of the ratios of the estimated totals of the characteristic to the corresponding value of the characteristic in 1940 for the p.s.u.'s selected.
   (iv) sampling system: $\left[\dfrac{2X_i X_j}{X^2},\ \left(\dfrac{Y_i'}{X_i} + \dfrac{Y_j'}{X_j}\right)\dfrac{X}{2}\right]$.

K. (i) p.s.u. selection: two p.s.u.'s each with p.p.s. to the total number of farms in 1940 but with replacement.
   (ii) s.s.u. selection: same as in F (ii).
   (iii) estimation: average of the ratios of the estimated total of the characteristic to the total number of farms in 1940 for the p.s.u.'s selected.
   (iv) sampling system: $\left[\dfrac{2F_{0i} F_{0j}}{F_0^2},\ \left(\dfrac{Y_i'}{F_{0i}} + \dfrac{Y_j'}{F_{0j}}\right)\dfrac{F_0}{2}\right]$.

The method of selecting the p.s.u.’s for systems F and H depends on 
the value of the characteristic for these units in 1940. Therefore, these 
systems would be useful only for specific purpose surveys. On the other 
hand, systems G and K for which the method of selection of the p.s.u.'s 
is based on number of farms in 1940 would be suitable for general pur- 
pose surveys. Expressions of variances for each of the sampling systems 
described above are given in the Appendix; however, the procedure for 
arriving at an expression for the variance will be indicated for one of 
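the systems. Consider the unbiased sampling system

$$\left[\frac{X_i + X_j}{(N-1)X},\ \frac{Y_i' + Y_j'}{X_i + X_j}\,X\right].$$

$$\begin{aligned}
\operatorname{Var}\left[\frac{Y_i' + Y_j'}{X_i + X_j}\,X\right]
&= E\left[\frac{Y_i' + Y_j'}{X_i + X_j}\,X - Y\right]^2 \\
&= E\left[\frac{Y_i + Y_j}{X_i + X_j}\,X - Y\right]^2 + E\left[\frac{Y_i' + Y_j'}{X_i + X_j}\,X - \frac{Y_i + Y_j}{X_i + X_j}\,X\right]^2 \\
&= \underbrace{\left\{\frac{X}{N-1}\sum_{i<j}\frac{(Y_i + Y_j)^2}{X_i + X_j} - Y^2\right\}}_{\text{Between}}
 + \underbrace{\left\{\frac{1}{N-1}\sum_{i<j}\frac{(Z_i + Z_j)X}{X_i + X_j}\right\}}_{\text{Within}},
\end{aligned}$$

where $Z_i = M_i(M_i - m_i)\sigma_i^2/m_i$.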
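The between and within expressions for this system can be checked by enumerating every pair of p.s.u.'s in a small stratum. The sketch below does so for a hypothetical stratum of four p.s.u.'s; the figures and function names are assumptions made for illustration, and `Z` is simply taken as given rather than computed from segment data.

```python
from itertools import combinations


def system_f_moments(X, Y, Z):
    """Expectation, between and within variance by enumeration of all pairs.

    X[i]: 1940 values, Y[i]: true p.s.u. totals,
    Z[i]: variance of the estimate Y_i' within p.s.u. i.
    """
    N, X_total, Y_total = len(X), sum(X), sum(Y)
    expectation = between = within = 0.0
    for i, j in combinations(range(N), 2):
        prob = (X[i] + X[j]) / ((N - 1) * X_total)   # P(pair {i, j} is selected)
        ratio = X_total / (X[i] + X[j])              # factor applied to Y_i' + Y_j'
        estimate = ratio * (Y[i] + Y[j])
        expectation += prob * estimate
        between += prob * (estimate - Y_total) ** 2
        within += prob * ratio ** 2 * (Z[i] + Z[j])
    return expectation, between, within


def system_f_closed_form(X, Y, Z):
    """Between and within variance from the closed-form expressions."""
    N, X_total, Y_total = len(X), sum(X), sum(Y)
    pairs = list(combinations(range(N), 2))
    between = (X_total / (N - 1)
               * sum((Y[i] + Y[j]) ** 2 / (X[i] + X[j]) for i, j in pairs)
               - Y_total ** 2)
    within = (X_total / (N - 1)
              * sum((Z[i] + Z[j]) / (X[i] + X[j]) for i, j in pairs))
    return between, within


# Hypothetical stratum (all numbers invented for this example).
X = [40.0, 120.0, 130.0, 210.0]
Y = [45.0, 110.0, 150.0, 195.0]
Z = [4.0, 9.0, 16.0, 25.0]

expectation, between_enum, within_enum = system_f_moments(X, Y, Z)
between_formula, within_formula = system_f_closed_form(X, Y, Z)
```

The enumeration returns an expectation equal to the stratum total, which illustrates the unbiasedness of the system, and the two routes to the between and within components agree up to rounding.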


5. ANALYSIS OF VARIANCE COMPONENTS FOR SYSTEMS F, G, H, AND K


This section will deal with the analysis of the variance components of the sampling systems described in Section 4. The results of the analysis for systems C and D, where only one p.s.u. is selected from a stratum, are also presented here to facilitate comparative study with systems in which two p.s.u.'s are selected from a stratum. In this study, as before, the square of the coefficient of variation, (C.V.)², will be used for the total error. Calculations for both the between and within components of error have been made only for systems F and G, in which the selection of p.s.u.'s is made without replacement. For systems H and K, in which the p.s.u.'s are selected with replacement, calculations have been made on the between p.s.u. components of the error only.

From Tables 4 and 5 it would appear that systems F and G are, respectively, more efficient than systems D and C in estimating number of days worked off farms and are highly so on the between components


TABLE 4

BETWEEN COMPONENTS OF (C.V.)² × 10⁴ FOR SAMPLING SYSTEMS C, D, F, G, H, AND K*

Measure of Size                     1940 No. of Farms      1940 Value of Char.
Sampling System                      G      K      C         F      H      D
CHARACTERISTIC
No. of Non-White Operators          16     42     16         3      8      3
Value of Land and Buildings         10     27      7         2      5      1
No. of Days Worked off Farms        82    252    111        48    117     48
Total No. of Farms                   1      4      2         1      4      2

* For definition of sampling systems C and D, see Section 2 and for F, G, H, and K, see Section 4. (C.V.)² × 10⁴ rounded to the nearest integer but (C.V.)² calculated correct to the sixth decimal figure.


TABLE 5

TOTAL (C.V.)² × 10⁴ FOR SAMPLING SYSTEMS C, D, F, AND G

Measure of Size                     1940 No. of Farms          1940 Value of Char.
Sampling System                       G            C             F            D
Sampling Rate %                    0.5    2     0.5    2      0.5    2     0.5    2

CHARACTERISTIC
No. of Non-White Operators         125   42     112   39      115   30     109   28
Value of Land and Bldgs.            52   20      45   16       45   12      40   11
No. of Days Worked off Farms       127   93     152  121       92   55      96   60
Total No. of Farms                  41   11      38   10.5     41   11      38   10.5


of variance. With an increase in the sub-sampling rate from 0.5 to 2.0 per cent, there is an appreciable reduction in the total error. Systems F and G have an additional advantage over systems D and C, respectively, in that they would provide estimates of the between components of error from the sample. On the between components of variance, systems F and G are, respectively, more efficient than systems D and C in estimating the total number of farms. This seems to be a paradox, since the between p.s.u. (C.V.)² for any of 20 strata for systems F and G is expected to include in addition some between strata variance (of the original 40 strata). The method of estimation introduced in estimating the population total by the selection of two p.s.u.'s seems to cut down the variance of the estimated total, because the ratio $(Y_i' + Y_j')/(X_i + X_j)$ may be less variable than $Y_i'/X_i$. This reduction is naturally more pronounced where the between p.s.u. contribution is high, i.e., for number of days worked off farms for which the reduction in variance due to the method of estimation more than compensates for the increase in variance due to the inclusion of some between strata variation.

The between components of the total error for systems F and G, where selection is made without replacement, are highly efficient compared to the same components for systems H and K where selection is made with replacement.

With a two per cent sample, it is estimated that system G would provide an estimate of each characteristic total within a ten per cent C.V. and within a seven per cent C.V. for all characteristics excepting number of days worked off farms. For a specific purpose survey, system F should be used.
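The comparison of selection without replacement (system F) and with replacement (system H) on the between component can be illustrated by exact enumeration in a small stratum. The figures below are hypothetical, the function names are assumptions, and the ordering found in this one example is not a general result.

```python
from itertools import combinations


def between_without_replacement(X, Y):
    """System F: first p.s.u. p.p.s. to X, second with equal probability
    from the remaining p.s.u.'s; estimator X (Y_i + Y_j) / (X_i + X_j)."""
    N, X_total, Y_total = len(X), sum(X), sum(Y)
    variance = 0.0
    for i, j in combinations(range(N), 2):
        prob = (X[i] + X[j]) / ((N - 1) * X_total)
        estimate = X_total * (Y[i] + Y[j]) / (X[i] + X[j])
        variance += prob * (estimate - Y_total) ** 2
    return variance


def between_with_replacement(X, Y):
    """System H: two independent p.p.s. draws with replacement;
    estimator is the average of X Y_i / X_i over the two draws."""
    X_total, Y_total = sum(X), sum(Y)
    variance = 0.0
    for i in range(len(X)):
        for j in range(len(X)):                      # ordered draws, i may equal j
            prob = X[i] * X[j] / X_total ** 2
            estimate = 0.5 * X_total * (Y[i] / X[i] + Y[j] / X[j])
            variance += prob * (estimate - Y_total) ** 2
    return variance


# Hypothetical stratum (all numbers invented for this example).
X = [40.0, 120.0, 130.0, 210.0]
Y = [45.0, 110.0, 150.0, 195.0]

between_f = between_without_replacement(X, Y)
between_h = between_with_replacement(X, Y)
```

For these assumed figures the between variance is smaller without replacement than with replacement, which is the direction reported for systems F and G relative to H and K.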


6. CONCLUSIONS


Of the four characteristics investigated in this paper, one was influenced greatly by war-time activities between 1940 and 1945. None of the sampling systems studied was suitable to estimate this item, number of days worked off farms, with seven per cent accuracy when a sampling rate of two per cent is used. If a stratified sample using the 98 counties as strata is substituted for a two-stage sampling system, a 1.25 per cent sampling rate would be sufficient to estimate this item with a five per cent C.V. The conclusions which follow relate to the other three characteristics only, namely number of non-white operators, value of land and buildings and total number of farms.

For a general purpose survey to estimate characteristic totals, system C is recommended. This system will provide estimates for all the types of stratification considered with no more than a nine per cent C.V. if a two per cent sampling rate is used.

Among the four types of stratification and p.s.u. considered in this study, two of them, namely, 40 strata using the county or 197 strata using the m.c.d. as the p.s.u., were found to be most suitable from the point of view of total error and are recommended for estimating the characteristic totals. With a two per cent sampling rate using system C and either of these types of stratification, estimates of the characteristic totals could be obtained within a C.V. of six per cent.

The within p.s.u. component of error is not reduced to the same ex- 
tent as the between p.s.u. component is increased, when the m.c.d. is 
used as the p.s.u. instead of the county with 40 strata. Even with the 
m.c.d. as the p.s.u., the within contribution to the total error is a very 
important factor for any reasonably practicable sample size. 

The efficiencies of p.p.s. and equal probability selection are compared 
in systems A and B. The method of estimation, ratio to the value of the 
characteristic in 1940, is the same in both systems. For all cases investi- 
gated, system A (p.p.s. selection of p.s.u.’s with number of farms in 
1940 as the measure of size) was found to be superior to system B (selec- 
tion of p.s.u.'s with equal probability). 

Although system D, in which selection is p.p.s. to the 1940 value of the 
characteristic, is impracticable when it is desired to estimate the totals 
of a number of characteristics from a single sample, it is generally suit- 
able for specific purpose surveys, where one is interested in only a single 
characteristic. However, system C was found to be nearly as efficient 
as system D in estimating totals for the characteristics studied in this 
paper. This may not be true in general. 

The choice of a sampling system has been based on the assumption 
of equal costs per schedule. There is a real need for using a more real- 
istic cost function; but this would necessitate acquiring adequate and 
accurate information on the various cost factors, e.g., cost of travel 
and cost of enumeration in survey designs. 

The principal contribution in this paper is the examination of a sampling design for the selection from a stratum of two p.s.u.'s, the first with p.p.s. and the second with equal probability from the remaining p.s.u.'s. The advantage of this design over one involving selection of a single p.s.u. from each stratum is that it permits an unbiased estimate [11] of the sampling error from sample data. In some cases, in addition, the system involving the selection of two p.s.u.'s per stratum may be more efficient than systems permitting the selection of only one p.s.u. from each of twice as many strata. It appears that this will be true when there is extreme variability among the p.s.u.'s, such as in the case of number of days worked off farms. The between p.s.u. variance for the three remaining items studied is much lower, but even in these cases the efficiencies of the systems selecting one p.s.u. per stratum do not greatly exceed those in which two p.s.u.'s per stratum are selected without replacement.

A considerable part of the material in this paper is taken from the 




senior author’s thesis submitted in partial fulfilment of the require- 
ments of the Ph.D. degree in Experimental Statistics, North Carolina 
State College, 1952. Correspondence with E. H. Jebe of Iowa State 
College and Morris Hansen of the Bureau of Census helped in the gen- 
eral orientation and understanding of certain aspects of the problem. 
The computational work involved in this study was enormous and 
would have remained incomplete but for the special operations devised 
by members of the I.B.M. staff. The authors wish to express their ap- 
preciation to D. G. Horvitz for offering many valuable suggestions in 
the final preparation of this paper. 


REFERENCES 


[1] Hansen, Morris H., and Hurwitz, William N., “On the Theory of Sampling
from Finite Populations,” Annals of Mathematical Statistics, 14 (1943),
332-62.

[2] Hansen, Morris H., and Hurwitz, William N., “On the Determination of
Optimum Probabilities in Sampling,” Annals of Mathematical Statistics, 20
(1949), 426-32.

[3] Horvitz, D. G., and Thompson, D. J., “A Generalization of Sampling With-
out Replacement from a Finite Universe,” Journal of the American Sta-
tistical Association, 47 (1952), 663-85.

[4] Jebe, Emil H., “Estimation for Sub-sampling Designs Employing the
County as a Primary Sampling Unit,” Journal of the American Statistical
Association, 47 (1952), 49-70.

[5] Jessen, Raymond J., “The Master Sample Project and Its Uses in Agricul-
tural Economics,” Journal of Farm Economics, 29 (1947), 531-40.

[6] Kastenbaum, Marvin A., “On Sampling for State Estimates of the Farm
Population in North Carolina Using the Township as a Primary Sampling
Unit,” Institute of Statistics Mimeograph Series, No. 29. North Carolina
State College, Raleigh, N. C. (1950).

[7] Madow, Lillian H., “On the Use of the County as the Primary Sampling
Unit for State Estimates,” Journal of the American Statistical Association,
45 (1950), 30-47.

[8] Midzuno, Hiroshi, “An Outline of the Theory of Sampling Systems,” Annals
of the Institute of Statistical Mathematics (Japan), (1950), 149-56.

[9] Narain, R. D., “On Sampling Without Replacement with Varying Prob-
abilities,” Journal of the Indian Society of Agricultural Statistics, 3 (1951),
169-74.

[10] Sen, Amode Ranjan, “Present Status of Probability Sampling and its Use in
the Estimation of Farm Characteristics,” Paper presented at the joint
meeting of the Econometric Society with the Institute of Mathematical
Statistics, Minneapolis, Minn., Sept. 1951. [Abstract in Econometrica, 20
(1952), 108.]

[11] Sen, Amode Ranjan, “Further Developments of the Theory and Applica-
tion of the Selection of Primary Sampling Units, with Special Reference to
the North Carolina Agricultural Population,” Ph.D. Thesis, Library, North
Carolina State College.




APPENDIX 

The expressions given below are the mean square errors for the esti- 
mates for each of the nine sampling systems presented in this paper 
(Section 2 and Section 4). As before A, B, C, D, and E stand for the
systems in which one p.s.u. is selected from a stratum and F, G, H, 
and K for the systems in which two p.s.u.'s are selected from a stratum. 
For simplicity of notation only the results for a single stratum are 
presented. These results when summed over all strata will provide the 
mean square error of the estimate for the entire state, noting that the 
bias is first summed over all strata and this sum is squared. 
Sampling system A: 


F 1 2 
т. xf X L— MM. m) A 
i 0 mi 
Within p.s.u. variance 
Fo Y; 1 Fo; Y; d 
Ex EIER ll 
X Fo LX; г Е, Х; 
Between p.s.u. variance 
н Fo; Y; Y7 
X: ола нн А 
CP Ex xx 
(Bias)? 
where $\sigma_i^2$ = variance for single s.s.u.’s selected at random within the ith
p.s.u.
Sampling system B: 


Within variance 


-niz xb 7 x] 
| 


Between variance 


[ из A zh 
ТОЙУ гә SIS 


” (Bias)? 


Prine Se 




where $\sigma_i^2$ is defined as for $V_A$.
Sampling system C:


Vo= AX ут ММ, т) I + In 2-0}, 


i о 4 0 
Within variance Between variance 


where $\sigma_i^2$ is defined as for $V_A$.
Sampling system D: 


1 {2 Y, 2 
Vp = x{ У —-M(M; — m) z ds bz =- n. 
6 Xi Mi A 

Within variance Between variance 


where $\sigma_i^2$ is defined as for $V_A$.
Sampling system E:


1 
E , Nye] ` 
2 
Ve = {vx MM; — т) =} + {уме = r} ў 
i 4 Li 
Within variance Between variance 


where $\sigma_i^2$ is defined as for $V_A$.
Sampling system F: 


ES X; J (= 3n x] 
(y = 1)х NXictX; 
(2:+7)Х 
ve= {> У; x (N—0)QG-T n 
Ted variance 
Li (ET YM yey: 
+ {ср We) Qu X) P 


* Between variance 




where $Z_i = M_i(M_i - m_i)\sigma_i^2/m_i$ and $\sigma_i^2$ is defined as for $V_A$.
Sampling system G: 


[ Fo; + Fo; (= X n] 
(Nakata ils) S. 
Пе (2: + Z;)Fo } 
T { LL (У = 1)(Fo: + Fo) 
Within variance j 
1 (Y; + Y)? } 
тҮ, 
RU cS N=) Put Р) ° 
Between variance 


where $Z_i$, $Z_j$ are defined as for $V_F$.
Sampling system H: 


Е: Qu. B) J 
FANS CRISE 0 


reo lem me 


X, 4 mi 
Within variance 


«re Ey}. 


Between variance 


where $\sigma_i^2$ is defined as in $V_A$.
Sampling system K: 


amie aA 
po Glee es) 


Po m; 
Within variance ° 


teza 


Between variance 


Os 


where $\sigma_i^2$ is defined as in $V_A$.


COMBINING INDEPENDENT TESTS OF SIGNIFICANCE* 


ALLAN BIRNBAUM 
Columbia University 
It is shown that no single method of combining independent 

tests of significance is optimal in general, and hence that the 
kinds of tests to be combined should be considered in selecting 
a method of combination. A number of proposed methods of 
combination are applied to a particular common testing prob- 
lem. It is shown that for such problems Fisher's method and 
a method proposed by Tippett have an optimal property. 


1. THE PROBLEM AND SOME PROPOSED SOLUTIONS


THE problem of combining independent tests of significance has been
discussed and illustrated by a number of writers, including Fisher
[2], Karl Pearson (cf. [4]), Wallis [7], and E. S. Pearson [4], to which the
reader is referred for general discussions to supplement the present
brief section. The formal statistical problem may be stated as follows:
A hypothesis $H_0$ is to be tested. An observed value $t_1$ of a statistic
has been obtained; the best test of $H_0$ based on this statistic would
indicate rejection at the $u_1$ significance level. That is, $u_1$ is the
“probability level” corresponding to the observed value $t_1$; for example,
if large values of the statistic are critical for $H_0$, then $u_1$ is the
probability that a value as large as or larger than that observed will
occur under $H_0$. Similarly, independent values of statistics,
$t_2, \ldots, t_k$, have been obtained, and in the respective best tests of
$H_0$ based on these statistics the corresponding “probability levels” are
$u_2, \ldots, u_k$. The essential requirement of independence of the $t_i$'s
will be satisfied if each $t_i$ is based on a separate and independent set
of data; if each $t_i$ is based on the same set of observations, the $t_i$'s
must be known to be statistically independent functions of the
observations.

The problem of “combining these independent tests of significance”
then is the problem of giving a test of $H_0$ on the basis of a set of
observed values (probability levels) $u_1, u_2, \ldots, u_k$. The test is not
to utilize the observed values $t_1, t_2, \ldots, t_k$; in general, it is
assumed that

(a) either these values or else the forms of the distributions of
$t_1, t_2, \ldots, t_k$ are unknown to the statistician confronted with the
present problem, or

(b) this information is available but the distributions are such that
there is no known or reasonably convenient method available for
constructing a single appropriate test of $H_0$ based on
$(t_1, t_2, \ldots, t_k)$.

* Work sponsored by the Office of Naval Research.
The writer is indebted to Professor Henry Scheffé for helpful comments
on the first draft of this paper. Responsibility for any remaining
deficiencies is the writer's.

Procedures for combining independent significance tests may be of
practical use even in some situations in which the statistician has
complete freedom to determine the design of a complex experiment.
Suppose, for example, that a scientific hypothesis asserts that a change
in the value of an independent experimental variable will alter the
distribution of one or more of $k$ observable variables
$t_1, t_2, \ldots, t_k$. For example, an hypothesis to be tested may assert
that administration to subjects of a certain drug will have at least one
of the three effects:

(a) an increase in the mean of a certain measurable physiological
quantity,

(b) an increase in the variance (within a subject) of a second
measurable physiological quantity, and

(c) a decrease in the probability of a subject's correctly making a
certain sensory discrimination.

Suppose that optimal tests for each of these effects separately could
be based respectively on statistics $t_1$, $t_2$, and $t_3$. In such
situations construction of a single optimal test for the presence of one
or more of the effects may be difficult or impossible. However, combining
statistically independent tests based on $t_1$, $t_2$, and $t_3$, a single
test at a desired significance level can be given. With appropriate
design, this test will also meet given requirements of power to detect
one or more of the three effects. It is shown in Section 3 that for some
problems such a test will even have certain efficiency properties.

To avoid technical complications not of direct interest here, let us
assume that the $t_i$'s have continuous distributions (densities). (See [7]
for a discussion of the important discrete case.) Since $u_i$ is the
probability when $H_0$ is true of observing a value of our $i$th statistic
at least as large as $t_i$, we may write

$$u_i = u_i(t_i) = \int_{t_i}^{\infty} p_i(t)\,dt, \tag{1}$$

where $p_i(t_i)$ is the probability density function of $t_i$ under $H_0$.
Then the probability that $u_i$ lies in any interval, say
$u' \le u_i \le u''$, equals $u'' - u'$, or in other words $u_i$ has a uniform
distribution on the unit interval under $H_0$ with density

$$f_i(u_i) = \begin{cases} 1, & 0 \le u_i \le 1, \\ 0, & \text{otherwise}, \end{cases} \tag{2}$$

for each $i$, and the $u_i$'s are mutually independent.

Each method of combining tests is a rule prescribing that $H_0$ should
be rejected whenever the set of values $(u_1, \ldots, u_k)$ falls in a
certain critical region. Intuitively speaking, small values of the $u_i$'s
are indicative of rejection; to discuss satisfactorily the problem of
constructing a critical region of values of $(u_1, \ldots, u_k)$, we must
consider the possible distributions of the $u_i$'s when $H_0$ is false. We
shall assume here that whenever a $u_i$ has a non-uniform distribution, it
is distributed on the unit interval according to some (unknown) density
function $g_i(u_i)$ which is non-increasing.¹
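The monotonicity assumption can be illustrated numerically for one familiar case. The sketch below assumes, purely for illustration, a one-sided test on a normal mean with unit variance, where the statistic is N(0, 1) under the null hypothesis and N(mu, 1) with mu > 0 under the alternative. The density of the probability level u is then the likelihood ratio exp(mu t - mu^2/2) evaluated at the t that corresponds to u. The shift mu = 1 and the grid are hypothetical choices.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def g(u, mu):
    """Density of u = P(T >= t | H0) when in fact T ~ N(mu, 1)."""
    t = STD_NORMAL.inv_cdf(1.0 - u)       # statistic giving probability level u
    return math.exp(mu * t - 0.5 * mu * mu)  # likelihood ratio p'(t) / p(t)

grid = [i / 100 for i in range(1, 100)]   # u = 0.01, ..., 0.99
values = [g(u, mu=1.0) for u in grid]
is_non_increasing = all(a >= b for a, b in zip(values, values[1:]))
```

For this family the likelihood ratio increases with t, so the density of u decreases as u grows, which is the behaviour the assumption requires.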

Depending on the nature of the experimental situations in which the
$u_i$'s are obtained, the appropriate alternative hypothesis would be
either:

$H_A$: All of the $u_i$'s have the same (unknown) non-uniform,
non-increasing density $g(u)$.
or:
$H_B$: One or more of the $u_i$'s have (unknown) non-uniform densities
$g_i(u_i)$.

Under $H_A$, the $t_i$'s are statistics of the same kind obtained from $k$
replications of an experiment, in which the underlying conditions are
assumed to remain constant with $H_0$ false. Under $H_B$, the $t_i$'s may be
statistics of different kinds (for example, a normal mean and a normal
variance), and the conditions under which the $t_i$'s are obtained need
not be the same; it is assumed only that $H_0$ is false in the case of at
least one of the $t_i$'s. $H_A$ is seen to be a special case of $H_B$.
Probably in the majority of applications, $H_B$ is the appropriate
alternative hypothesis.²


¹ This assumption is not a strong one for our purposes: Suppose large
values of the statistic $t$ are critical for testing $H_0$ against $H_1$,
and the probability densities of $t$ under $H_0$ and $H_1$ are $p(t)$ and
$p'(t)$, respectively. Then the definition of the statistic $u$ is

$$u = u(t) = \int_t^{\infty} p(x)\,dx, \tag{3}$$

so $du/dt = -p(t)$. If the probability density of $t$ is $p'(t)$ then that
of $u$ is

$$g(u) = p'(t)/|du/dt| = p'(t)/p(t). \tag{4}$$

Hence $g(u)$ will be a non-increasing function of $u$ if and only if
$p'(t)/p(t)$ is a non-decreasing function of $t$


encountered in applied statistics, 
id all other distributions of 




Some of the methods which have been proposed for combining
independent tests of significance (i.e., for constructing critical
regions of values of $(u_1, u_2, \ldots, u_k)$) are the following:

(1) Fisher's [2] method: reject $H_0$ if and only if
$u_1 u_2 \cdots u_k \le c$, where $c$ is a predetermined constant
corresponding to the desired significance level. Wallis, on pp. 231-34 of
[7], discusses in detail Fisher's method of appropriately determining $c$.
It turns out that $-2 \log u_1 u_2 \cdots u_k$ is distributed as chi-square
with $2k$ degrees of freedom when $H_0$ is true. If $d$ is such that

$$\text{Prob}\{\chi_{2k}^2 \ge d\} = \alpha \tag{5}$$

where $1-\alpha$ is the desired significance level, then setting
$-2 \log c = d$, we obtain $c = e^{-d/2}$.

(2) Karl Pearson’s method: reject $H_0$ if and only if
$(1-u_1)(1-u_2) \cdots (1-u_k) \ge c$, where $c$ is a predetermined constant
corresponding to the desired significance level. In applications, $c$ can
be computed by a direct adaptation of the method used to calculate the
$c$ used in Fisher's method.

(3) Wilkinson's [8] methods: reject $H_0$ if and only if $u_i \le c$ for $r$
or more of the $u_i$'s, where $r$ is a predetermined integer,
$1 \le r \le k$, and $c$ is a predetermined constant corresponding to the
desired significance level. The $k$ possible choices of $r$ give $k$
different procedures which we shall refer to as case 1 ($r=1$), case 2
($r=2$), etc. For example, if $k=2$ and a test at the .95 significance
level is desired, the case 1 procedure is: reject $H_0$ if either $u_1$ or
$u_2$ or both are less than or equal to $c = 1-(.95)^{1/2} = .025$
(equivalently, if either $1-u_1$ or $1-u_2$ equals or exceeds
$(.95)^{1/2} = .974$); the case 2 procedure is: reject $H_0$ if both $u_1$
and $u_2$ are less than or equal to $c = (.05)^{1/2} = .224$ (equivalently,
if both $1-u_1$ and $1-u_2$ equal or exceed $1-(.05)^{1/2} = .776$). Case 1
was proposed earlier by Tippett [5].

In the following sections, certain bases for selecting methods of
combination for particular problems will be developed.
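The three rules can be put side by side in a short sketch. It is an illustration rather than the procedure of any of the cited authors: the function names are hypothetical, and the chi-square tail is computed from the closed form that holds for an even number of degrees of freedom, so that only the standard library is needed. Each "level" function returns the probability under the null hypothesis of a result at least as extreme as the one observed; the hypothesis is rejected when that probability does not exceed the chosen size (.05 in the example of the text).

```python
import math

def chi2_upper_tail(x, k):
    """P(chi-square with 2k d.f. >= x); closed form valid for even d.f."""
    half = x / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(k))

def fisher_level(us):
    """Null probability of a product u_1 ... u_k as small as observed."""
    x = -2.0 * sum(math.log(u) for u in us)
    return chi2_upper_tail(x, len(us))

def pearson_level(us):
    """Null probability of a product (1-u_1)...(1-u_k) as large as observed."""
    x = -2.0 * sum(math.log(1.0 - u) for u in us)
    return 1.0 - chi2_upper_tail(x, len(us))

def wilkinson_size(c, k, r):
    """P(at least r of k independent uniform u_i fall at or below c)."""
    return sum(math.comb(k, j) * c ** j * (1 - c) ** (k - j)
               for j in range(r, k + 1))

# Thresholds on u_i for k = 2 and size .05.
c_case1 = 1 - 0.95 ** 0.5   # case 1: reject if either u_i <= c_case1
c_case2 = 0.05 ** 0.5       # case 2: reject if both u_i <= c_case2
```

Both thresholds give `wilkinson_size` equal to .05 up to rounding, which ties the constants to the size of the test.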


2. A GENERAL CONDITION FOR ADMISSIBILITY OF 
METHODS OF COMBINATION 


The following condition is readily seen to be satisfied by each of the
proposed methods described above:

Condition 1: If $H_0$ is rejected for any given set of $u_i$'s, then it
will also be rejected for all sets of $u_i^*$'s such that $u_i^* \le u_i$
for each $i$.

Any method of combination which failed to satisfy this condition
would seem unreasonable. In fact, it is not difficult to prove that the
best test of $H_0$ against any particular alternative $H_B$ of the kind
described above satisfies Condition 1. (A proof is given in the Appendix.)




Since Condition 1 is satisfied by so many possible methods of
combination, the question arises whether any further reasonable condition
can be imposed to narrow still further the class of methods from which
we must choose. The answer is no: So long as we consider the problem
in the present generality, and not with reference to a particular kind
of testing problem, there are no restrictions on the possible forms of
the density functions $g_i(u_i)$ except that they be non-increasing. And
it can be shown (see the Appendix) that for each method of combination
satisfying Condition 1, we can find some alternative $H_B$ represented by
non-increasing functions $g_1(u_1), \ldots, g_k(u_k)$ against which that
method of combination gives a best test of $H_0$.

These considerations prove that to find useful bases for choosing 
methods of combination, we must consider further the particular 
kinds of tests to be combined in any given problem. In the following 
sections, it is shown that certain methods are optimal for certain im- 
portant categories of testing problems. 


3. DISTRIBUTIONS OF THE KOOPMAN FORM 


Nearly all of the density functions and discrete probability
distribution functions encountered frequently in applied statistics can
be written in the so-called Koopman form, which is

$$f(x, \theta) = c(\theta)\,a(\theta)^{b(x)}\,\phi(x) \tag{6}$$

where $\theta$ is a parameter of the distribution and $x$ is an observed
value, and $a$, $b$, $c$, and $\phi$ denote arbitrary functions. Examples are

1. The binomial:

$$f(x, \theta) = \binom{n}{x}\theta^x(1-\theta)^{n-x} = (1-\theta)^n\left(\frac{\theta}{1-\theta}\right)^x\binom{n}{x}. \tag{7}$$

2. The normal, with known (say unit) variance and mean $\theta$:

$$f(x, \theta) = \frac{1}{\sqrt{2\pi}}\,e^{-(x-\theta)^2/2} = e^{-\theta^2/2}\,(e^{\theta})^{x}\,\frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}. \tag{8}$$

Other examples are the Poisson and exponential distributions and
the normal distribution with known mean and unknown variance.
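The two factorizations can be checked numerically. The sketch below is only a verification aid: it evaluates each distribution once directly and once as the product c(theta) a(theta)^b(x) phi(x), with the factor assignments for the binomial and the unit-variance normal written out in the comments. The function names and the test values are illustrative assumptions.

```python
import math

def binomial_direct(x, n, theta):
    return math.comb(n, x) * theta ** x * (1 - theta) ** (n - x)

def binomial_koopman(x, n, theta):
    c = (1 - theta) ** n          # c(theta)
    a = theta / (1 - theta)       # a(theta)
    b = x                         # b(x)
    phi = math.comb(n, x)         # phi(x)
    return c * a ** b * phi

def normal_direct(x, theta):
    return math.exp(-0.5 * (x - theta) ** 2) / math.sqrt(2 * math.pi)

def normal_koopman(x, theta):
    c = math.exp(-0.5 * theta ** 2)                       # c(theta)
    a = math.exp(theta)                                   # a(theta)
    b = x                                                 # b(x)
    phi = math.exp(-0.5 * x ** 2) / math.sqrt(2 * math.pi)  # phi(x)
    return c * a ** b * phi

binomial_ok = all(
    math.isclose(binomial_direct(x, 10, 0.3), binomial_koopman(x, 10, 0.3))
    for x in range(11)
)
normal_ok = all(
    math.isclose(normal_direct(x, 0.7), normal_koopman(x, 0.7))
    for x in (-2.0, -0.5, 0.0, 1.3, 2.5)
)
```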

Consider a problem of combining independent significance tests,
each of which is a test on a distribution of the Koopman form (all $k$
distributions need not be of the same sort however; for example, one
might be on a normal mean and another on a binomial mean, as in the
illustration in Section 1 above). A method of combining these tests
will be equivalent to a test of a hypothesis specifying the values of
the parameters,

$$H_0: \theta_1 = \theta_1^0, \ldots, \theta_k = \theta_k^0,$$

on the basis of the observed values of the statistics $t_1, \ldots, t_k$.
For such problems, a minimal criterion for the reasonableness of a test
is known. “Reasonableness” is used here in the sense of admissibility of
a test, which may be defined as follows: A test is admissible if there is
no other test with the same significance level which, without ever being
less sensitive to possible alternative hypotheses, is more sensitive to
at least one alternative. In other words, an admissible test is one which
cannot be strictly (that is, uniformly) improved upon. A necessary
condition for admissibility of a test of $H_0$ in our problem is that the
acceptance region of the test (that is, the values of $(t_1, \ldots, t_k)$
for which the test accepts $H_0$) be convex. (A region is convex if the
line segment connecting each pair of points in the region lies entirely
in the region.) This is shown in [1].

We may illustrate both this condition for admissibility and its
application to methods of combination by considering the problem of
combining two tests on means of normal distributions with known (say
unit) variances. (The performance of Fisher’s method when applied
to such a problem has been considered by Wallis (pp. 237-39 of [7])
and by Pearson (p. 142 of [4]).) Let $\bar{x}_1$ denote the mean of a sample
of $n_1$ observations obtained in an experiment in which the underlying
population mean had the unknown value $\mu_1$; let $\bar{x}_2$ be the mean of
a sample of $n_2$ observations in a similar experiment in which the
unknown population mean was $\mu_2$. In this case any method of combining
tests of the two hypotheses $\mu_1 = 0$ and $\mu_2 = 0$ is equivalent to a
test of $H_0$: $\mu_1 = \mu_2 = 0$; then $H_A$ would specify
$\mu_1 = \mu_2 \ne 0$; and $H_B$ would specify that either $\mu_1$ or $\mu_2$
or both are not zero. Let $t_1 = \sqrt{n_1}\,\bar{x}_1$ and
$t_2 = \sqrt{n_2}\,\bar{x}_2$. Then any method of combining the tests on
$\mu_1$ and $\mu_2$ can be represented as a test of $H_0$ by its critical
region in the $(t_1, t_2)$ plane. Each of the methods of combination
described above has been applied to the present problem, and the critical
region corresponding to each method is illustrated in the figures below.
The significance level $\alpha = .05$ was used throughout. The tests on
$\mu_1$ and $\mu_2$ to be combined were taken first to be against two-sided
alternatives (Figures 1-4) and then against one-sided alternatives
(Figures 6-9). In each case the critical region was obtained by first
determining the values of $u_1$ and $u_2$ for which the method of
combination considered would reject $H_0$ at the .05 significance level,
and then plotting the corresponding values of $t_1$ and $t_2$ by use of the
equations relating the $t_i$'s to the $u_i$’s. These equations are, for the
two-sided tests,

$$u_i = \frac{2}{\sqrt{2\pi}} \int_{|t_i|}^{\infty} e^{-x^2/2}\,dx, \quad \text{for } i = 1, 2, \tag{9}$$

and for the one-sided tests,

$$u_i = \frac{1}{\sqrt{2\pi}} \int_{t_i}^{\infty} e^{-x^2/2}\,dx, \quad \text{for } i = 1, 2. \tag{10}$$

Fig. 1. Wilkinson's Method, Case 1. 


We can now apply the condition for admissibility of a test described
above: The acceptance regions obtained by Wilkinson’s method, case
2, and by Pearson’s method are not convex. Hence they represent tests
of $H_0$, and corresponding methods of combination for the present
problem, which can be strictly improved upon by other tests and
corresponding methods of combination. Present knowledge does not provide
methods of finding tests which actually do strictly improve upon a
given inadmissible test in problems like the present one. However, it
seems advisable in selecting tests to restrict consideration to the class
of admissible tests, and to select from this class a test which seems to
have relatively good sensitivity (power) against the range of
alternatives of interest.


Fig. 2. Wilkinson's Method, Case 2.


It is shown in [1] that, for a category of problems including the 
present one, convexity of the acceptance region is a sufficient as well 
as necessary condition for admissibility. 
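The convexity criterion lends itself to a simple numerical check. The sketch below is an illustration under stated assumptions, not part of the original analysis: it takes the one-sided case with k = 2 and size .05, maps each point of a coarse grid in the (t_1, t_2) plane to its probability levels, and searches for two accepted points whose midpoint is rejected. Finding such a pair shows that an acceptance region is not convex; finding none on a grid is consistent with convexity but does not prove it. The grid and all function names are hypothetical.

```python
import math
from itertools import product
from statistics import NormalDist

ALPHA = 0.05
PHI = NormalDist().cdf

def upper_tail(t):
    """One-sided probability level of a unit-variance normal statistic."""
    return 1.0 - PHI(t)

def reject_wilkinson(u1, u2, r):
    c = 1 - (1 - ALPHA) ** 0.5 if r == 1 else ALPHA ** 0.5
    return sum(u <= c for u in (u1, u2)) >= r

def reject_fisher(u1, u2):
    p = u1 * u2
    return p * (1 - math.log(p)) <= ALPHA        # chi-square tail, 4 d.f.

def reject_pearson(u1, u2):
    q = (1 - u1) * (1 - u2)
    return 1 - q * (1 - math.log(q)) <= ALPHA    # lower chi-square tail, 4 d.f.

def has_midpoint_violation(reject):
    """True if two accepted grid points have a rejected midpoint."""
    grid = [float(v) for v in range(-3, 4)]
    points = list(product(grid, grid))

    def accepted(pt):
        return not reject(upper_tail(pt[0]), upper_tail(pt[1]))

    for p, q in product(points, points):
        mid = ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)
        if accepted(p) and accepted(q) and not accepted(mid):
            return True
    return False

results = {
    "wilkinson case 1": has_midpoint_violation(lambda a, b: reject_wilkinson(a, b, 1)),
    "wilkinson case 2": has_midpoint_violation(lambda a, b: reject_wilkinson(a, b, 2)),
    "fisher": has_midpoint_violation(reject_fisher),
    "pearson": has_midpoint_violation(reject_pearson),
}
```

For instance, the points (3, 0) and (0, 3) are both accepted by Wilkinson's case 2 and by Pearson's rule, while their midpoint (1.5, 1.5) is rejected by both.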

The remaining two methods of combination, Wilkinson’s case 1
and Fisher’s, correspond to admissible tests. Inspection of Figures 1 and
3 suggests that each is fairly sensitive to departures from $H_0$ in all
directions; that Fisher’s method comes close to that test of $H_0$
(represented in Figure 5) which, if $n_1 = n_2$ and if the seriousness of a
departure from $H_0$ is measured by $\mu_1^2 + \mu_2^2$, is the best test at
the .05 level (as Wallis noted in [7]); and finally that Wilkinson’s
method, case 1, gives a relative concentration of sensitivity to
alternatives in which the departure from $H_0$ occurs in just one of the
parameters. Similar observations can be made in Figures 6 and 8. Hence,
it seems warranted to make a choice between the two methods remaining
under consideration on the basis of a subjective appraisal of the context
in which a problem like the present one actually occurs; probably in
most cases Fisher’s method would be preferred.


Fig. 3. Fisher's Method.


Having considered in detail a problem involving a particular
distribution of the Koopman form, we proceed now to show that similar
considerations apply to the whole class of such distributions. It can be
verified easily that if Wilkinson’s methods are used to combine tests on
any such distributions, the result corresponds to a test whose
acceptance region has a rectangular boundary like those in Figures 1, 2,
6, and 7, and is convex only in case 1. Hence, only case 1 of Wilkinson’s
method corresponds to an admissible test for certain of the Koopman-form
distributions being considered. The remaining cases of the method
correspond to inadmissible tests for all Koopman-form distributions.

With little more difficulty it can be verified that Pearson’s method
does not give a test of $H_0$ with convex acceptance region for any
Koopman-form distributions (consider the three points in the
$(t_1, t_2)$ plane corresponding to $(u_1, u_2) = (1-c, 0)$, to
$(u_1, u_2) = (0, 1-c)$, and to $(u_1, u_2) = (1-\sqrt{c}, 1-\sqrt{c})$,
for the case $k = 2$). Thus, Pearson’s method also may be removed from
consideration as inadmissible for Koopman-form distributions. Fisher's
method does seem to give tests of $H_0$ with convex acceptance regions for
Koopman-form distributions; consideration of the points in the
$(t_1, t_2)$ plane corresponding to $(u_1, u_2) = (1, c)$, to
$(u_1, u_2) = (c, 1)$, and to $(u_1, u_2) = (\sqrt{c}, \sqrt{c})$ suggests
this, and for particular distributions it may be possible to verify it
fully without too much difficulty. For example, for

$$f(x, \theta) = \frac{1}{\theta}\,e^{-x/\theta}, \quad x \ge 0, \tag{11}$$

to combine two one-sided tests on $\theta$ based on one observation each,
we have

$$u_i = e^{-x_i/\theta_0}, \quad \text{for } i = 1, 2, \tag{12}$$

and the critical region $u_1 u_2 \le c$ corresponds to a test with the
convex acceptance region $x_1 + x_2 \le -\theta_0 \log c$.

Fig. 4. Pearson's Method.
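The equivalence claimed in the exponential example can be confirmed by direct computation. The sketch below assumes a hypothesized scale theta_0 = 2 and a size of .05, both illustrative choices, and finds Fisher's constant c for k = 2 by bisection on the identity c(1 - log c) = .05, which is the chi-square tail with 4 degrees of freedom written in terms of c. It then compares the product rule with the sum rule at randomly placed points.

```python
import math
import random

THETA0 = 2.0    # hypothetical value of the scale parameter under H0
ALPHA = 0.05

def fisher_constant(alpha):
    """Solve c * (1 - log c) = alpha for c in (0, 1) by bisection."""
    lo, hi = 1e-12, 1.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if mid * (1 - math.log(mid)) < alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

C = fisher_constant(ALPHA)

def product_rule_rejects(x1, x2):
    u1 = math.exp(-x1 / THETA0)
    u2 = math.exp(-x2 / THETA0)
    return u1 * u2 <= C

def sum_rule_rejects(x1, x2):
    return x1 + x2 >= -THETA0 * math.log(C)

rng = random.Random(12)
points = [(rng.uniform(0, 20), rng.uniform(0, 20)) for _ in range(1000)]
rules_agree = all(
    product_rule_rejects(x1, x2) == sum_rule_rejects(x1, x2)
    for x1, x2 in points
)
```

Because the rejection region is a half-plane bounded by the line x_1 + x_2 = -theta_0 log c, the acceptance region is a triangle in the positive quadrant and hence convex.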


Fig. 5. Best Symmetric Test Against $H_B$.


4. CONCLUSIONS


While there is no single method of combining tests which is best for
all problems, it appears that to combine independent tests on
Koopman-form distributions (these include most distributions commonly
occurring in applied statistics) one should choose between Fisher’s
method and Wilkinson's method, case 1. Fisher's method appears to
have somewhat more uniform sensitivity to the alternatives of interest
in most problems. For any particular distributions, investigations may
be made paralleling those above to obtain a still more conclusive basis
for choice of a method of combination.


Fig. 6. Wilkinson's Method, Case 1. One-sided Alternatives.


APPENDIX 


To prove that, as stated in Section 2 above, every test of $H_0$ which
is best against some particular alternative specifying non-increasing
densities, satisfies Condition 1, we use the well-known fact, proved for
example in [3], that any best critical region consists of points
satisfying

$$\frac{g_1(u_1) \cdots g_k(u_k)}{f_1(u_1) \cdots f_k(u_k)} \ge c, \quad c \text{ some constant.} \tag{13}$$

Now $f_i(u_i) = 1$ for $0 \le u_i \le 1$, $i = 1, \ldots, k$. Hence (13)
reduces to $g_1(u_1) \cdots g_k(u_k) \ge c$. As the $g_i(u_i)$'s are
non-increasing, $g_1(u_1') \cdots g_k(u_k') \ge g_1(u_1) \cdots g_k(u_k) \ge c$
if $(u_1, \ldots, u_k)$ is in the best critical region and if
$u_i' \le u_i$ for $i = 1, \ldots, k$. Thus Condition 1 is satisfied.

However, in general $H_B$ (and even $H_A$) will include a whole set of
possible forms of the $g_i(u_i)$'s, and it is not true in general that
there will exist a single test of $H_0$ which is uniformly best against
all possibilities. This is illustrated most simply in the following case
under $H_B$: If $g_1(u_1)$ is uniform and $g_2(u_2)$ is nonuniform, a best
critical region consists of all $(u_1, u_2)$ such that $u_2 \le c$, $c$
some constant; if $g_2(u_2)$ is uniform and $g_1(u_1)$ nonuniform, a best
critical region consists of all $(u_1, u_2)$ such that $u_1 \le c'$, $c'$
some constant; thus, there is not a single critical region which is best
against each alternative. It can be verified directly that every best
test of $H_0$ against a “Bayes mixture” of simple alternatives under
$H_B$ also satisfies Condition 1. It follows, as shown by Wald in [6],
that under general assumptions Condition 1 is a necessary condition
for admissibility of a test of $H_0$ against a composite alternative
$H_B$.

Fig. 7. Wilkinson’s Method, Case 2. One-sided Alternatives.
We shall show next that, as stated in Section 2 above, each method
of combination meeting Condition 1 is best against some particular
alternative hypothesis $H_B$. Taking $k = 2$ for simplicity, any critical
region $w$ of values of $(u_1, u_2)$, if it satisfies Condition 1, can be
characterized by giving its boundary function $u_2(u_1)$, a
non-increasing function such that $w$ consists of all points $(u_1, u_2)$
in the unit square $0 \le u_1 \le 1$, $0 \le u_2 \le 1$ for which
$u_2 \le u_2(u_1)$. Let $u_2(u_1)$ be any such boundary function. Let
$g_2(u_2) = \tfrac{2}{3}(2-u_2)$ for $0 \le u_2 \le 1$, and let
$g_1(u_1) = \tfrac{3}{2}c\,(2-u_2(u_1))^{-1}$ for $0 \le u_1 \le 1$, where
$c$ is determined by the condition that $\int_0^1 g_1(u_1)\,du_1 = 1$. A
best critical region for testing $H_0$ against the alternative
$g_1(u_1)$, $g_2(u_2)$ is the set $w'$ on which
$g_1(u_1)g_2(u_2) \ge c$. But
$g_1(u_1)g_2(u_2) = c(2-u_2)/(2-u_2(u_1)) \ge c$ if and only if
$u_2 \le u_2(u_1)$. Thus the arbitrarily given boundary function
$u_2(u_1)$ characterizes a best critical region $w'$.

Fig. 8. Fisher's Method. One-sided Alternatives.
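The construction can be traced numerically for one concrete boundary. The sketch below assumes, for illustration only, the Fisher-type boundary u_2(u_1) = min(1, c_0/u_1) with a hypothetical constant c_0 = 0.0087. It fixes the normalizing constant c by a midpoint-rule integration and then checks at random points that the product of the two densities is at least c exactly when the point lies on or below the boundary. Points within a tiny distance of the boundary are skipped so that rounding cannot decide the comparison.

```python
import random

C0 = 0.0087   # hypothetical constant defining the illustrative boundary

def boundary(u1):
    """Non-increasing boundary function u2(u1) of the critical region."""
    return min(1.0, C0 / u1)

def g2(u2):
    return (2.0 / 3.0) * (2.0 - u2)

def normalizing_constant(steps=20000):
    """Constant c that makes g1 integrate to one over (0, 1)."""
    h = 1.0 / steps
    integral = h * sum(1.5 / (2.0 - boundary((j + 0.5) * h))
                       for j in range(steps))
    return 1.0 / integral

C = normalizing_constant()

def g1(u1):
    return 1.5 * C / (2.0 - boundary(u1))

rng = random.Random(7)
construction_holds = True
for _ in range(2000):
    u1, u2 = rng.uniform(1e-6, 1.0), rng.uniform(0.0, 1.0)
    if abs(u2 - boundary(u1)) < 1e-9:
        continue                      # too close to the boundary to compare
    in_best_region = g1(u1) * g2(u2) >= C
    below_boundary = u2 <= boundary(u1)
    construction_holds = construction_holds and (in_best_region == below_boundary)

g1_non_increasing = all(g1(a / 100) >= g1((a + 1) / 100) for a in range(1, 100))
```

Since the boundary is non-increasing, the denominator 2 - u_2(u_1) is non-decreasing, so the constructed density of u_1 is non-increasing as the alternative requires.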

(Similar methods give analogous results for the problem of testing
$H_0$ against $H_A$, with Condition 1 now strengthened by the requirement
that the boundary function $u_2(u_1)$ be symmetric about the line
$u_1 = u_2$.)


Fig. 9. Pearson’s Method. One-sided Alternatives.


REFERENCES 


[1] Birnbaum, Allan, “Characterizations of Complete Classes of Tests of Some
Multiparametric Hypotheses, with Applications to Likelihood Ratio Tests,”
to be published in Annals of Mathematical Statistics.

[2] Fisher, R. A., Statistical Methods for Research Workers, Fourth and later
editions, Edinburgh and London, Oliver and Boyd, 1932 and later, Section
21.1.

[3] Neyman, J., First Course in Probability and Statistics, New York, Henry Holt
and Company, 1950, Section 5.3.1, 304-8.

[4] Pearson, E. S., “The Probability Integral Transformation for Testing Good-
ness of Fit and Combining Independent Tests of Significance,” Biometrika,
30 (1938), 134-48.

[5] Tippett, L. H. C., The Methods of Statistics, First Edition, London, Williams
and Norgate, Ltd., 1931, Section 3.5, 53-6.

[6] Wald, Abraham, Statistical Decision Functions, New York, J. Wiley and
Sons, Inc., 1950, Theorem 3.20, 101.

[7] Wallis, W. Allen, “Compounding Probabilities from Independent Signifi-
cance Tests,” Econometrica, 10 (1942), 229-48.

[8] Wilkinson, B., “A Statistical Consideration in Psychological Research,”
Psychological Bulletin, 48 (1951), 156-7.


Fig. 10. Best Test of $H_0$ Against $H_A$. One-sided Alternatives.




MINIMUM LIFE IN FATIGUE 


A. M. FREUDENTHAL AND E. J. GUMBEL
Columbia University 


IN CONVENTIONAL fatigue testing a specimen is repeatedly stressed in
bending, torsion or tension-compression during imposed force-cycles
of constant amplitude. The number N of cycles at which the specimen
breaks, which is a function of the applied stress-amplitude S, is the
observed variate [1]. The problem is to obtain, for specimens of
specified material, shape and size, the probability of surviving up to N
repetitions of a specified stress-cycle in a given testing procedure.

In a previous paper [2] fatigue was interpreted as an extremal phe- 
nomenon. A statistical scheme for the analysis of fatigue failures was 
developed and applied to the few observed series which satisfy the 
criteria for the applicability of any statistical procedure. The prob- 
ability of surviving up to N cycles was obtained from the asymptotic 
theory of smallest values of a non-negative statistical variate. In this 
derivation it was assumed that a non-zero probability of failure existed 
even at the first cycle. This led to a good fit of the theory to tests made 
on copper and on aluminum at certain stress levels. However, it did 
not fit well enough the tests at other (lower) stress levels or of other 
metals. It appeared that such metals under certain stress levels would 
survive with certainty a substantial number of stress cycles, which 
thus represents a threshold value of the phenomenon. 

Therefore a minimum number of cycles (“minimum life”) is now
introduced into the survivorship function. Although this generalization
does not change the fundamental nature of the probability function, a
new estimate of the parameters becomes necessary. The extremal
distribution function used in the following has already been introduced
on a purely empirical basis by Weibull [6, 7] in his analysis of the
distribution of the stress amplitude S for constant values of N.


1. THE LINEAR THEORY


In the previous theory [2] the probability $l(N)_S$ of surviving $N$
cycles under the stress $S$ was

$$l(N)_S = \exp\left[-(N/V_S)^{\alpha_S}\right] \tag{1.1}$$

with the boundary conditions valid for all values of $S$

$$l(0)_S = 1; \quad l(\infty)_S = 0. \tag{1.1'}$$

In equation (1.1) $V_S$ is the characteristic number of cycles
corresponding to the probability

$$l(V_S)_S = 1/e \tag{1.2}$$

and $1/\alpha_S$ is proportional to the standard deviation of the
logarithms of the number of cycles. If we write

$$\ln[-\ln l(N)_S] = y; \quad 1/\alpha_S' = 0.43429/\alpha_S \tag{1.3}$$

where ln stands for the natural logarithm, the relation (1.1) takes on
the linear form

$$y = \alpha_S'(\log N - \log V_S); \quad l(N)_S = \exp[-e^{y}] \tag{1.4}$$

where $y$ is a reduced variate without dimension and log stands for the
common logarithm. The survivorship function (1.4) which is known by
the actuaries as the Gompertz function has recently been tabulated
by the National Bureau of Standards [8].
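The relation between the survivorship function and its linear form can be checked with a short computation. The sketch below is an illustration: the characteristic number of cycles V and the exponent alpha are hypothetical values, and the function names are not from the paper. The constant 0.43429 is the common logarithm of e, so the slope on the common-logarithm scale is alpha divided by that constant.

```python
import math

LOG10_E = math.log10(math.e)   # 0.43429..., the constant in (1.3)

def survivorship(n_cycles, v, alpha):
    """Probability of surviving n_cycles, as in (1.1)."""
    return math.exp(-((n_cycles / v) ** alpha))

def reduced_variate(n_cycles, v, alpha):
    """Reduced variate y of (1.4), computed on the common-log scale."""
    alpha_prime = alpha / LOG10_E
    return alpha_prime * (math.log10(n_cycles) - math.log10(v))

V, ALPHA_S = 1.0e6, 2.0        # hypothetical parameter values
checks = [
    (n, survivorship(n, V, ALPHA_S),
     math.exp(-math.exp(reduced_variate(n, V, ALPHA_S))))
    for n in (2.0e5, 5.0e5, 1.0e6, 1.5e6)
]
```

Each tuple in `checks` holds the number of cycles, the survival probability from (1.1), and the same probability recovered from the reduced variate; the last two agree up to rounding.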

Instead of the theoretical limits $0 < N < \infty$, practical limits for the
survivorship function are given by the interval


(1.5)    $V_S e^{-12.5/\alpha_S} < N < V_S e^{2.50/\alpha_S}$


for the number of cycles. In this approximation probabilities of the
order $10^{-5}$ are neglected.
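As a numerical illustration of (1.4) and (1.5), the following Python sketch evaluates the survivorship function of the linear theory and the probabilities neglected at the two practical limits. The function names and the values chosen for $V_S$ and $\alpha_S$ are illustrative assumptions and are not taken from the paper.

```python
import math

LOG10_E = math.log10(math.e)  # 0.43429..., the constant appearing in (1.3)

def reduced_variate(n_cycles, v_s, alpha_s):
    """y = alpha_S' (log N - log V_S), with 1/alpha_S' = 0.43429/alpha_S, eq. (1.4)."""
    alpha_prime = alpha_s / LOG10_E
    return alpha_prime * (math.log10(n_cycles) - math.log10(v_s))

def survival_linear(n_cycles, v_s, alpha_s):
    """l(N)_S = exp(-e^y), eq. (1.4); the same quantity as exp[-(N/V_S)^alpha_S] in (1.1)."""
    return math.exp(-math.exp(reduced_variate(n_cycles, v_s, alpha_s)))

def practical_limits(v_s, alpha_s):
    """Lower and upper practical limits for N from eq. (1.5)."""
    return v_s * math.exp(-12.5 / alpha_s), v_s * math.exp(2.5 / alpha_s)

# Hypothetical parameter values, chosen only for illustration.
V_S, ALPHA_S = 100.0, 5.0
lower, upper = practical_limits(V_S, ALPHA_S)
p_at_v = survival_linear(V_S, V_S, ALPHA_S)                 # 1/e by (1.2)
p_fail_lower = 1.0 - survival_linear(lower, V_S, ALPHA_S)   # failure probability neglected below the lower limit
p_surv_upper = survival_linear(upper, V_S, ALPHA_S)         # survival probability neglected above the upper limit
```

Both neglected probabilities come out at a few parts in a million, which is the order of magnitude stated in the text.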

For homogeneous material and testing procedures, the probability of
surviving a fixed number of cycles $N$ decreases with increasing stress
amplitude $S$. Since the relation (1.4) between $\log N$ and $y$ is linear and
$\alpha_S'$ is the slope of this line, the $\alpha_S$ are constant and independent of
$S$, which means that the probabilities $l(N)_S$ traced on extremal probability
paper against $\log N$ are parallel straight lines, which therefore
cannot intersect. It follows that the estimates of $\alpha_S$ should be constant
within errors of random sampling. However, the acceptable domain of
variation of the estimates cannot be established a priori since it depends
upon the spacing of the stress levels, i.e., on the experimental
design.

The first equation (1.4) gives a graphical criterion for the validity
of the theory: The logarithms of the observed numbers of cycles $N$ at
fracture traced on extremal probability paper against the reduced
variate $y$ should be scattered about the straight line


(1.6)    $\log N = \log V_S + y/\alpha_S'$.


In this formula the $y$ corresponding to the observed numbers
$N_m$ ($m = 1, 2, \ldots, n$) are obtained from the plotting positions [2]




MINIMUM LIFE IN FATIGUE 577 
(1.7)    $l(N_m)_S = 1 - m/(n + 1)$


where $m$ stands for the ranks of the observed numbers $N_m$ arranged in
increasing magnitude and $n$ is the total number of specimens tested.
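The plotting positions (1.7) and the corresponding reduced variates (1.3) can be tabulated in a few lines. The sketch below is only an illustration; the function names are assumptions. For $n = 20$ the first and last positions reproduce the limits 0.9524 and 0.0476 quoted later for the observable part of the survivorship function.

```python
import math

def plotting_positions(n):
    """Survival plotting positions l(N_m)_S = 1 - m/(n + 1) for ranks m = 1..n, eq. (1.7)."""
    return [1.0 - m / (n + 1) for m in range(1, n + 1)]

def reduced_variates(n):
    """Reduced variate y_m = ln[-ln l(N_m)_S] of eq. (1.3) for each rank m."""
    return [math.log(-math.log(p)) for p in plotting_positions(n)]

positions = plotting_positions(20)   # n = 20 specimens, as in the tests discussed in the paper
variates = reduced_variates(20)
```

Plotting $\log N_m$ against `variates` then gives the graphical check described by (1.6).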

It was shown [2] that the theory (1.1) fits torsion fatigue test results 
for copper and aluminum at relatively high stress levels. However, for 
low stress levels the test results approach a curve which, for small 
numbers of cycles, is definitely bending to the right (upwards). This
means that fracture can occur only after a certain number of cycles. 
This number is henceforth called the minimum life. It is a physical 
constant for given material and testing procedure and its estimate is 
subject to statistical variations. 

In addition to the graphical criterion, there is a numerical criterion:
If the theory (1.1) holds, the arithmetic and geometric standard deviations
$\sigma(N)$ and $\sigma(\log N)$ are related as shown in Table I of the previous
paper [2]. This relation, traced in Figure 1, leads to the following
procedure: We estimate the characteristic number $V_S$, calculate the
standard deviation $s(N)$ and the geometric standard deviation $s(\log N)$,
and check whether the quotient $s(N)/V_S$ corresponds to the value
prescribed by the graph from the sample value $s(\log N)$. If this relation
is not fulfilled, the assumption cannot be sustained.

Fig. 1. Criterion for the validity of the linear theory.
(Abscissa: geometric standard deviation $\sigma(\log N)$.)


2. THE GENERAL THEORY 


If the assumption that the minimum life is zero is refuted by one of
the above criteria, the asymptotic probability function (1.1) is used,
limiting the variate $N$ by the condition $N \geq N_{0,S}$, whence


(2.1)    $l(N)_S = \exp\left[-\left(\frac{N - N_{0,S}}{V_S - N_{0,S}}\right)^{\alpha_S}\right]$


with the properties

(2.2)    $l(V_S)_S = 1/e; \quad l(N_{0,S})_S = 1.$


For a given stress amplitude $S$ all specimens survive any number of
cycles up to the minimum life $N_{0,S}$. For values approaching $N_{0,S}$ the
survivorship function (2.1) is bent to the right and approaches a
straight line perpendicular to the $N$ axis as shown in Figures 7 and 9.

In (2.1) $V_S$ and $N_{0,S}$ are parameters of location, have the dimension
of $N$ and the relation $V_S > N_{0,S} \geq 0$, while $1/\alpha_S$ is a parameter of scale
without dimension. In the case $N_{0,S} = 0$ the formulae of the previous
paper [2] are obtained. Within the framework of the extremal theory
the generalization (2.1) of (1.1) is legitimate since a linear translation
of an extreme is still an extreme.

The parameter $V_S$ has the same meaning in the previous (linear) and
the present (general) theory. Since it corresponds to a common fixed
probability it decreases in both theories with increasing values of $S$.
However, within the two theories different estimators have to be used.
For all values of $S$ where the minimum life $N_{0,S}$ differs from zero, it
decreases with increasing $S$.

In the special case $\alpha_S = 1$, $1/\alpha_S' = 0.43429$ the probability of survival
(2.1) degenerates into an exponential function. This probability traced
on semi-logarithmic paper is a linear function of the numbers of cycles.
The parameter $V_S$ coincides with the mean $\bar{N}_S$. The standard deviation
$\sigma(N)_S$ is


(2.3)    $\sigma(N)_S = V_S - N_{0,S}.$


Therefore, the minimum life $N_{0,S}$ may be estimated in this case by the
difference of the mean and the standard deviation of the number of
cycles. Another estimation based on the smallest number of cycles,
which in this case is the most precise one, was given by J. Neyman and
E. S. Pearson [4]. This model seems unrealistic for fatigue observations
because the exponential function implies that the expectation of
future life is independent of the preceding number of cycles, i.e., of the
"history." This assumption is compatible with certain physical processes,
such as radio-active decay, but not with fatigue. In fact all observations
available on fatigue lead to estimations for $\alpha_S$ which exceed unity.

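For any value of $\alpha_S$ the median life $\breve{N}_S$ obtained from (2.1) is


(2.4)    $\breve{N}_S - N_{0,S} = (V_S - N_{0,S})(\ln 2)^{1/\alpha_S}.$


To obtain the modal life $\tilde{N}_S$ consider the distribution $p(N)_S$ of the
number of cycles at failure obtained from (2.1) as


(2.5)    $p(N)_S = \frac{\alpha_S}{V_S - N_{0,S}} \left(\frac{N - N_{0,S}}{V_S - N_{0,S}}\right)^{\alpha_S - 1} \exp\left[-\left(\frac{N - N_{0,S}}{V_S - N_{0,S}}\right)^{\alpha_S}\right].$

Differentiation with regard to $N$ leads to the mode

(2.6)    $\tilde{N}_S - N_{0,S} = (V_S - N_{0,S})(1 - 1/\alpha_S)^{1/\alpha_S}.$

A mode exists only for $1/\alpha_S < 1$; $1/\alpha_S' < 0.43429$ and

(2.7)    the mode precedes, equals, or exceeds the median if
         $1/\alpha_S \gtreqless 0.30685$; $\quad 1/\alpha_S' \gtreqless 0.13326$.


For $\alpha_S = 3.25889$ the distribution (2.5) is nearly symmetrical. Three
other pseudosymmetrical cases will show up later.

The density of probability at the mode increases with $\alpha_S$. This is
shown in Figure 2 where $(N - N_{0,S})/(V_S - N_{0,S})$ is used as abscissa.
For increasing values of $\alpha_S$ the median $\breve{N}_S$ and the mode $\tilde{N}_S$ converge
toward the characteristic number $V_S$.
It will be shown in paragraph 3 that the same holds for the mean $\bar{N}_S$.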

Fig. 2. Influence of $\alpha_S$ on the shape of the distributions of $N_S$.
(Abscissa: reduced number of cycles $(N - N_{0,S})/(V_S - N_{0,S})$;
ordinate: density of probability.)


An estimate of the location parameter $V_S$ is more convenient to
characterize the fatigue failure of a given material than the mean, the mode,
or the median.


For the graphical representation of the numbers $N$ at fracture the
reduced variate $y$ defined in (1.3) is used. Equation (2.1) then becomes,
in analogy to (1.6),


(2.8)    $\log (N - N_{0,S}) = \log (V_S - N_{0,S}) + y/\alpha_S'$,


where $\alpha_S'$ and $\alpha_S$ are related by (1.3). In the previous (linear) theory the
parameter $1/\alpha_S'$ is the slope of $\log N$ plotted against $y$. In the present
theory it is the slope of $\log (N - N_{0,S})$ against $y$. Within the previous
theory $\alpha_S$ is independent of $S$, within the present theory it may depend
upon $S$. If $V_S$ is large compared to $N_{0,S}$ the logarithms of the number of
cycles at fracture traced on the extremal probability paper against $y$
are practically linear as long as $N$ is in the neighborhood of $V_S$. However,
if $N_{0,S}$ is of the same order as $V_S$ the curve is bent to the right
(upwards) for $N < V_S$. For the same values of $V_S$ and $N_{0,S}$, the asymptotic
value $N_{0,S}$ is approached more quickly if $1/\alpha_S$ becomes larger.
Two curves are parallel if they have the same value of $1/\alpha_S$ and
$(V_S - N_{0,S})$ although the parameters $V_S$ and $N_{0,S}$ may differ.

The limiting condition (1.5) leads from (2.8) to a number of cycles


(2.9)    $N_{\omega,S} = N_{0,S} + (V_S - N_{0,S}) e^{2.5/\alpha_S}$


for which $l(N_\omega)_S$ is practically zero.

For a large number of cycles and small probabilities of survival a
relatively small increase in the number of cycles considerably reduces
the probability of survival. This holds for the linear and the general
theory and corresponds to the popular saying of the straw that
broke the camel's back. For high probabilities of survival a considerable
decrease in the number of cycles is necessary in the linear theory
in order to increase the probability of survival by a small amount, while
in the general theory a small decrease in the number of cycles has a large
influence on the probability of survival.

It has often been assumed that the logarithms of the number of cycles
to fracture in fatigue are normally distributed. However, in this case,
the probability of survival converges in the same way to zero and to
unity. Therefore the use of this distribution for extrapolation does not
seem to be legitimate.


3. ESTIMATE OF THE PARAMETERS


Since the relation (2.8) between $\log N$ and $y$ is no longer linear as
was the case for $N_{0,S} = 0$ the graphical estimate of the parameters given
previously [2] is no longer feasible. Instead the classical method of
moments advocated by Weibull [7] for the analysis of breaking
strengths is used. The special case $\alpha_S = 1$ was settled in (2.3).




In the general case $\alpha_S \neq 1$ the reduced moments of order $k$ obtained
from the distribution (2.5) are the gamma functions


(3.1)    $\overline{\left(\frac{N - N_{0,S}}{V_S - N_{0,S}}\right)^{k}} = \Gamma(1 + k/\alpha_S).$


For $k = 1, 2, 3$ three equations are obtained which may be used for the
estimation of the three parameters. The mean,

(3.2)    $\bar{N}_S - N_{0,S} = (V_S - N_{0,S})\,\Gamma(1 + 1/\alpha_S)$,


depends upon the three parameters and the probability at the mean
depends upon $\alpha_S$. The relation between the mean and the characteristic
number of cycles is


         $\bar{N}_S \lesseqgtr V_S$ if $1/\alpha_S \lesseqgtr 1$; $\quad 1/\alpha_S' \lesseqgtr 0.43429$.


For increasing values of $\alpha_S$ the mean converges to the characteristic
number $V_S$. The variance $\sigma^2(N)_S$ obtained from (3.1) and (3.2),


(3.3)    $\sigma^2(N)_S = (V_S - N_{0,S})^2\,[\Gamma(1 + 2/\alpha_S) - \Gamma^2(1 + 1/\alpha_S)]$,


also depends upon the 3 parameters. The corresponding sample variance
$s^2(N)_S = s_S^2$ is obtained by the usual procedure as


(3.3')   $s_S^2 = n(\overline{N_S^2} - \bar{N}_S^2)/(n - 1).$

The skewness $\sqrt{\beta_{1,S}}$ defined by

(3.4)    $\sqrt{\beta_{1,S}} = \mu_{3,S}\,\sigma_S^{-3}$,


where $\mu_{3,S}$ is the third central moment is

(3.5)    $\sqrt{\beta_{1,S}} = [\Gamma(1 + 3/\alpha_S) - 3\Gamma(1 + 2/\alpha_S)\Gamma(1 + 1/\alpha_S) + 2\Gamma^3(1 + 1/\alpha_S)]\,[\Gamma(1 + 2/\alpha_S) - \Gamma^2(1 + 1/\alpha_S)]^{-3/2}$

and depends only upon the parameter $1/\alpha_S$. If the population value
$\sqrt{\beta_{1,S}}$ is replaced by the sample value

(3.5')   $\sqrt{b_{1,S}} = \frac{\sqrt{n(n - 1)}}{n - 2} \cdot \frac{\overline{N_S^3} - 3\,\overline{N_S^2}\,\bar{N}_S + 2\bar{N}_S^3}{(\overline{N_S^2} - \bar{N}_S^2)^{3/2}}$

an estimate of $1/\alpha_S$ is obtained. To facilitate this procedure, the right
side of equation (3.5) as a function of $1/\alpha_S$ is given in Table 1, cols.
4 and 1.¹ The value of $1/\alpha_S'$ in equation (2.8) is obtained then from
(1.3).
The two remaining parameters of location, the characteristic number
$V_S$ and the minimum life $N_{0,S}$, are simple to estimate. Combination of
(3.3) and (3.2) leads to


(3.6)    $V_S = \bar{N}_S + \sigma_S A(\alpha_S)$,


where the standardized distance from the characteristic number to
the mean


(3.6')   $A(\alpha_S) = [1 - \Gamma(1 + 1/\alpha_S)]\,[\Gamma(1 + 2/\alpha_S) - \Gamma^2(1 + 1/\alpha_S)]^{-1/2}$


is given in Table 1, col. 3.

¹ This table was calculated by Gladys R. Garabedian of Stanford University. The authors take
this occasion to thank her for this important contribution.

Since $1/\alpha_S$ is estimated from (3.5), the parameter $V_S$ may be estimated
from (3.6) after replacing the population mean and standard
deviation by the sample values. The result can be checked from the
observations traced on the extremal probability paper with the help
of the first equation (2.2).

To estimate the minimum life the value of $V_S$ given in (3.6) is introduced
in (3.2) whence


(3.7)    $N_{0,S} = V_S - \sigma_S B(\alpha_S)$


where the standardized distance from the characteristic number to
the minimum life


(3.7')   $B(\alpha_S) = [\Gamma(1 + 2/\alpha_S) - \Gamma^2(1 + 1/\alpha_S)]^{-1/2}$


is given in Table 1, col. 2. For the estimation of $N_{0,S}$ we use the previous
estimates of $1/\alpha_S$ and of $V_S$ and replace the population value $\sigma_S$
by the sample value $s_S$.

Thus the estimate $\hat{\alpha}_S$ is obtained from $\sqrt{b_{1,S}}$, equation (3.5'), with
the help of Table 1. The two other estimates are from (3.6) and (3.7),


(3.8)    $\hat{V}_S = \bar{N}_S + s_S A(\hat{\alpha}_S); \quad \hat{N}_{0,S} = \hat{V}_S - s_S B(\hat{\alpha}_S).$


The minimum life is thus estimated directly, without using iterated
procedures.
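The estimation scheme (3.5) to (3.8) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' computing procedure: the Table 1 lookup is replaced by a bisection on the skewness function (3.5), which assumes that the skewness increases with $1/\alpha_S$ over the tabulated range, and the sample skewness is taken to be the third moment listed in Table 3 divided by the cube of the standard deviation. All function names are hypothetical.

```python
import math

def _g(k, x):
    """Gamma(1 + k x): reduced moment of order k in eq. (3.1), with x = 1/alpha_S."""
    return math.gamma(1.0 + k * x)

def dist_a(x):
    """A(alpha_S) of eq. (3.6'), standardized distance from V_S to the mean."""
    return (1.0 - _g(1, x)) / math.sqrt(_g(2, x) - _g(1, x) ** 2)

def dist_b(x):
    """B(alpha_S) of eq. (3.7'), standardized distance from V_S to N_0,S."""
    return 1.0 / math.sqrt(_g(2, x) - _g(1, x) ** 2)

def skewness(x):
    """Population skewness of eq. (3.5) as a function of x = 1/alpha_S."""
    g1, g2, g3 = _g(1, x), _g(2, x), _g(3, x)
    return (g3 - 3.0 * g2 * g1 + 2.0 * g1 ** 3) / (g2 - g1 ** 2) ** 1.5

def solve_inverse_alpha(sample_skewness, lo=0.01, hi=5.0, tol=1e-10):
    """Bisection for 1/alpha_S; stands in for the Table 1 lookup (assumed monotone)."""
    if not skewness(lo) <= sample_skewness <= skewness(hi):
        raise ValueError("skewness outside the range covered by Table 1")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if skewness(mid) < sample_skewness:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def estimate_parameters(mean, std, sample_skewness):
    """Return (1/alpha_S, V_S, N_0,S) following eq. (3.8)."""
    x = solve_inverse_alpha(sample_skewness)
    v_s = mean + std * dist_a(x)
    n0 = v_s - std * dist_b(x)
    return x, v_s, n0

# Nickel wire at S = 15.5 kg/mm^2: mean, standard deviation and third moment as listed in Table 3.
mean_s, std_s, third_moment = 1232.75, 132.7193, 2861621.0
x_hat, v_hat, n0_hat = estimate_parameters(mean_s, std_s, third_moment / std_s ** 3)
inv_alpha_prime = 0.43429 * x_hat   # conversion to 1/alpha_S' by eq. (1.3)
```

With these inputs the sketch should land close to the Table 3 entries for this stress level (scale parameter 0.31438, characteristic number 1,249.93, sensitivity limit 1,051.63), up to the rounding of Table 1.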

The result may be checked by another estimate based on the observed
smallest number $N_1$ of cycles at fracture. Its plotting position
$1 - 1/(n + 1)$ for $n$ observations obtained from (1.7) and equation (2.1) lead to
the unbiased estimate


(3.8')   $\hat{N}_{0,S} = \frac{N_1\,(n + 1)^{1/\hat{\alpha}_S} - \hat{V}_S}{(n + 1)^{1/\hat{\alpha}_S} - 1}.$

This estimate is always smaller than the observed smallest number of
cycles. Of course, this method also requires the previous estimate of the
two other parameters $1/\alpha_S$ and $V_S$.


TABLE 1
ESTIMATION OF THE THREE PARAMETERS

                  Multiple of standard deviation
     1               2                  3                 4
   Scale      for minimum life    for parameter        Reduced
 parameter        $N_{0,S}$            $V_S$          3d moment
 $1/\alpha_S$    $B(\alpha_S)$     $A(\alpha_S)$   $\sqrt{\beta_1}(\alpha_S)$
               equ. (3.7')        equ. (3.6')       equ. (3.5)

    .01          78.9817            .4481            -1.0813
    .02          39.9890            .4461            -1.0249
    .03          26.9862            .4439             -.9707
    .04          20.4808            .4416             -.9185
    .05          16.5744            .4392             -.8680
    .06          13.9673            .4366             -.8191
    .07          12.1029            .4339             -.7717
    .08          10.7024            .4310             -.7258
    .09           9.6114            .4281             -.6811
    .10           8.7369            .4250             -.6376
    .11           8.0199            .4219             -.5953
    .12           7.4209            .4186             -.5540
    .13           6.9128            .4152             -.5137
    .14           6.4761            .4118             -.4743
    .15           6.0965            .4082             -.4357
    .16           5.7633            .4046             -.3980
    .17           5.4682            .4008             -.3610
    .18           5.2050            .3970             -.3247
    .19           4.9686            .3931             -.2891
    .20           4.7549            .3891             -.2541

    .21           4.5608            .3850             -.2197
    .22           4.3835            .3809             -.1858
    .23           4.2209            .3767             -.1525
    .24           4.0711            .3724             -.1196
    .25           3.9326            .3681             -.0872
    .26           3.8041            .3637             -.0553
    .27           3.6844            .3592             -.0237
    .28           3.5727            .3547             +.0075
    .29           3.4680            .3501              .0383
    .30           3.3698            .3455              .0687
    .31           3.2774            .3408              .0989
    .32           3.1901            .3361              .1287
    .33           3.1077            .3313              .1583
    .34           3.0296            .3265              .1876
    .35           2.9554            .3217              .2167
    .36           2.8849            .3168              .2455
    .37           2.8178            .3119              .2741
    .38           2.7537            .3069              .3024
    .39           2.6925            .3019              .3306
    .40           2.6339            .2969              .3586
    .41           2.5778            .2919              .3865
    .42           2.5239            .2868              .4141
    .43           2.4721            .2817              .4417
    .44           2.4224            .2766              .4691
    .45           2.3745            .2715              .4963
    .46           2.3282            .2663              .5235
    .47           2.2836            .2612              .5505
    .48           2.2406            .2560              .5775
    .49           2.1989            .2508              .6043
    .50           2.1587            .2456              .6311
    .51           2.1196            .2404              .6578
    .52           2.0818            .2352              .6845
    .53           2.0451            .2299              .7110
    .54           2.0095            .2247              .7376
    .55           1.9749            .2195              .7640
    .56           1.9412            .2142              .7905
    .57           1.9085            .2090              .8169
    .58           1.8767            .2038              .8433
    .59           1.8456            .1985              .8697
    .60           1.8154            .1933              .8960
    .61           1.7859            .1881              .9224
    .62           1.7571            .1829              .9488
    .63           1.7290            .1777              .9751
    .64           1.7016            .1725             1.0015
    .65           1.6748            .1673             1.0279
    .66           1.6486            .1621             1.0544
    .67           1.6230            .1570             1.0808
    .68           1.5980            .1518             1.1073
    .69           1.5734            .1467             1.1338
    .70           1.5494            .1416             1.1604
    .71           1.5259            .1365             1.1870
    .72           1.5029            .1314             1.2137
    .73           1.4803            .1263             1.2404
    .74           1.4581            .1213             1.2672
    .75           1.4364            .1163             1.2941
    .76           1.4151            .1113             1.3210
    .77           1.3942            .1063             1.3480
    .78           1.3737            .1013             1.3751
    .79           1.3535            .0964             1.4023
    .80           1.3338            .0915             1.4295
    .81           1.3143            .0866             1.4569
    .82           1.2952            .0818             1.4844
    .83           1.2765            .0770             1.5119
    .84           1.2580            .0722             1.5396
    .85           1.2399            .0674             1.5674
    .86           1.2220            .0627             1.5953
    .87           1.2045            .0580             1.6233
    .88           1.1872            .0533             1.6514
    .89           1.1703            .0487             1.6797
    .90           1.1536            .0441             1.7080
    .91           1.1371            .0395             1.7366
    .92           1.1209            .0350             1.7652
    .93           1.1050            .0305             1.7940
    .94           1.0893            .0260             1.8230
    .95           1.0738            .0216             1.8521
    .96           1.0586            .0172             1.8814
    .97           1.0436            .0129             1.9108
    .98           1.0289            .0085             1.9403
    .99           1.0143            .0042             1.9701
   1.00           1.0               0.0               2.0

   1.0            1.                0.                2.
   1.2             .7522           -.0766             2.6400
   1.4             .5633           -.1364             3.3820
   1.6             .4184           -.1797             4.2021
   1.8             .3076           -.2081             5.3235
   2.              .2236           -.2236             6.6188
   3.              .03824          -.1912            19.5849
   4.              .00502          -.1154            60.0917
   5.              .00053          -.0626           190.


Finally, the theoretical numbers $N$ corresponding to given probabilities
$l(N)_S$ are obtained from (2.8) as

(3.9)    $N = N_{0,S} + (V_S - N_{0,S})\,e^{y/\alpha_S}$.
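Equation (3.9) is the inverse of the survivorship function and can be evaluated directly. The sketch below is a minimal illustration with a hypothetical function name; the parameter values are the estimates listed in Table 3 for nickel at the lowest stress level, with $\alpha_S$ recovered from $1/\alpha_S'$ by (1.3).

```python
import math

def cycles_for_survival(prob, v_s, n0, alpha_s):
    """N with survival probability prob, eq. (3.9), using y = ln(-ln prob) from (1.3)."""
    if not 0.0 < prob < 1.0:
        raise ValueError("prob must lie strictly between 0 and 1")
    y = math.log(-math.log(prob))
    return n0 + (v_s - n0) * math.exp(y / alpha_s)

# Estimates for nickel at S = 15.5 kg/mm^2 as listed in Table 3 (thousands of cycles).
V_S, N0 = 1249.93, 1051.63
ALPHA_S = 0.43429 / 0.31438            # alpha_S from 1/alpha_S' = 0.31438, eq. (1.3)

n_char = cycles_for_survival(math.exp(-1.0), V_S, N0, ALPHA_S)   # returns V_S, cf. (2.2)
n_median = cycles_for_survival(0.5, V_S, N0, ALPHA_S)            # median life, cf. (2.4)
```

Evaluating the function on a grid of probabilities gives the theoretical curve to be compared with the observations on probability paper.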
4. INFLUENCE OF THE PARAMETERS


To show the influence of the parameters the values $N_0 = N_{0,S} = 1,000$;
$1/\alpha' = 1/\alpha_S' = 0.1$ and different values of $V_S$ are chosen. In this
simplified scheme it is assumed that the parameter $\alpha$ and the minimum life
$N_0$ are both independent of $S$, while $\alpha_S$ may and $N_{0,S}$ will depend upon
$S$. The numbers of cycles at fracture $N$ in 1,000 obtained from (2.8)


(4.1)    $\log (N - 1) = \log (V_S - 1) + 0.1\,y$


are traced in Figure 3. The graph shows how the survivorship functions
for different values of $V_S$ converge to unity for a common value of $N_0$.
Within the observable part $0.0476 \leq l(N)_S \leq 0.9524$ for $n = 20$ specimens
obtained from (1.7) the survivorship functions look fairly linear
and the slopes seem to increase systematically with decreasing $V_S$, i.e.,
with increasing stress levels, while in reality it was assumed that the
minimum life is invariant against changes in $S$ and that the parameters
$1/\alpha_S$ are constant. Thus the graph may serve as a warning against
relying too much on the graphical representation for the usual small
samples.

Fig. 3. Theoretical survivorship functions $l(N)_S$ for constant
$N_{0,S} = 1.0$, $1/\alpha_S' = 0.1$ and various values of $V_S$.
(Abscissa: logarithmic scale, $N$ in 1000.)

Since the normal distribution has sometimes been used in connection
with fatigue observations, it is worthwhile to analyze under what conditions
the distributions (2.5) look symmetrical. In addition to the
pseudo-symmetrical case (2.7) where the median and the mode coincide
three other pseudo-symmetrical cases exist. Comparison of
equations (2.4), (2.6), and (3.2) shows that the mean is equal to the
median


(4.2)    $\bar{N}_S = \breve{N}_S$ if $1/\alpha_S = 0.29075$,

and that the mean is equal to the mode

(4.3)    $\bar{N}_S = \tilde{N}_S$ if $1/\alpha_S = 0.30189$.


Table 1 shows the existence of a fourth pseudo-symmetrical case. The
third central moment is zero for $1/\alpha_S = 0.27760$. It follows from Table 1
that the distributions look symmetrical if the skewness is near to the
interval


(4.4)    $0 \leq \sqrt{\beta_{1,S}} < 0.1$.

Then the scale parameter is near to the interval

(4.5)    $0.27 < 1/\alpha_S < 0.31; \quad 3.2 < \alpha_S < 3.7; \quad 0.12 < 1/\alpha_S' < 0.13$.


The four values of $\alpha_S$ are so close one to another that no practical
distinction between these cases is possible. Of course none implies a
normal distribution of the number of cycles at fracture.
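The four pseudo-symmetrical values of $1/\alpha_S$ can be recomputed by root finding. The Python sketch below is an illustration only; the bisection helper, the function names and the search interval are assumptions, chosen so that the trivial solution $1/\alpha_S = 0$ is excluded. Since $N_{0,S}$ and $V_S - N_{0,S}$ cancel, each condition reduces to an equation in $x = 1/\alpha_S$ alone.

```python
import math

def bisect(func, lo, hi, tol=1e-12):
    """Plain bisection; assumes func changes sign exactly once on [lo, hi]."""
    f_lo = func(lo)
    if f_lo * func(hi) > 0.0:
        raise ValueError("no sign change on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

def g(k, x):
    """Gamma(1 + k x) with x = 1/alpha_S, cf. (3.1)."""
    return math.gamma(1.0 + k * x)

def mode_vs_median(x):
    """Zero where (2.6) equals (2.4): 1 - x = ln 2."""
    return (1.0 - x) - math.log(2.0)

def mean_vs_median(x):
    """Zero where (3.2) equals (2.4): Gamma(1 + x) = (ln 2)^x."""
    return g(1, x) - math.log(2.0) ** x

def mean_vs_mode(x):
    """Zero where (3.2) equals (2.6): Gamma(1 + x) = (1 - x)^x."""
    return g(1, x) - (1.0 - x) ** x

def third_central_moment(x):
    """Numerator of the skewness (3.5)."""
    return g(3, x) - 3.0 * g(2, x) * g(1, x) + 2.0 * g(1, x) ** 3

LO, HI = 0.1, 0.9   # assumed search interval for 1/alpha_S
roots = {
    "mode = median": bisect(mode_vs_median, LO, HI),
    "mean = median": bisect(mean_vs_median, LO, HI),
    "mean = mode": bisect(mean_vs_mode, LO, HI),
    "zero skewness": bisect(third_central_moment, LO, HI),
}
```

The four roots should fall close to the values 0.30685, 0.29075, 0.30189 and 0.27760 given in the text, all inside or near the interval (4.5).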

The values of Table 1 are drawn in Figures 4, 5, and 6. Figure 4
shows the parameter $1/\alpha_S$ and $1/\alpha_S'$ and the standardized distances
from the characteristic number to the mean and to the minimum life

         $A(\alpha_S) = (V_S - \bar{N}_S)/\sigma_S; \quad B(\alpha_S) = (V_S - N_{0,S})/\sigma_S$

as functions of the skewness $\sqrt{\beta_{1,S}}$. The distances decrease with
increasing skewness. Within the interesting domain of $1/\alpha_S$ the distance
from the mean to $V_S$ is smaller than the distance from $N_{0,S}$ to the
mean. Figure 5 compares the standardized distance from $V_S$ to $N_{0,S}$
with the standardized distance from $\bar{N}_S$ to $N_{0,S}$. Both are traced as
functions of the skewness and of the parameter $1/\alpha_S$ and $1/\alpha_S'$. Finally,
Figure 6 which shows the standardized distance from $V_S$ to $N_{0,S}$ as
a function of $1/\alpha_S$ and $1/\alpha_S'$ facilitates the estimation of the minimum
life.

Fig. 4. Parameter $1/\alpha_S$ and standardized distances $A(\alpha_S)$ and
$B(\alpha_S)$ as functions of skewness. (See Table 1.)

Fig. 5. Standardized distances as functions of parameter $1/\alpha_S$. (See Table 1.)

Fig. 6. Estimation of sensitivity limit $N_{0,S}$. (See Table 1.)

The estimations $\hat{N}_{0,S}$, $\hat{V}_S$, and $1/\hat{\alpha}_S$ and the three statistics $\bar{N}_S$,
$s_S$, $\sqrt{b_{1,S}}$ are related by the sample analogs of the three equations (3.5),
(3.6) and (3.7) which will now be analyzed. The estimation of the
parameter $1/\alpha_S$ depends only on the statistic $\sqrt{b_{1,S}}$. The population
value $1/\alpha_S$ increases with the skewness as shown in Figure 4. In the
previous (linear) theory this parameter was a function of the standard
deviation of the logarithms of the number of cycles. In the present
(general) theory it is a function of the skewness of the number of cycles.
The partial derivatives of $\hat{V}_S$ and $\hat{N}_{0,S}$ with respect to each of the
three statistics $\bar{N}_S$, $s_S$, $\sqrt{b_{1,S}}$, obtained from (3.8) lead to the following
relations: For constant values of the standard deviation and the skewness
the estimates of the characteristic number of cycles $V_S$ and of the
minimum life $N_{0,S}$ increase proportionally to the mean $\bar{N}_S$. For constant
mean and skewness and increasing standard deviations the estimates
of the characteristic number of cycles increase for $1/\alpha_S < 1$ and
decrease for $1/\alpha_S > 1$ and the minimum life decreases. For constant
mean and standard deviation the estimates of the characteristic number
decrease and the minimum life increases with the skewness. For
constant mean and skewness and increasing standard deviation the
difference $\hat{V}_S - \hat{N}_{0,S}$ increases. For constant mean and standard deviation
and increasing skewness the difference decreases.

The equations (3.6) and (3.7) lead to a new criterion which determines
whether the minimum life $N_{0,S}$ vanishes or not, since


(4.6)    $\hat{N}_{0,S} = 0$ if $\bar{N}_S + s_S[A(\hat{\alpha}_S) - B(\hat{\alpha}_S)] = 0.$

This condition may be written from (3.6') and (3.7')

(4.7)    $\hat{N}_{0,S} = 0; \quad \overline{N_S^2}/\bar{N}_S^2 = \Gamma(1 + 2/\hat{\alpha}_S)/\Gamma^2(1 + 1/\hat{\alpha}_S).$


The minimum life is taken to be zero, if the equality is fulfilled within
the errors of random sampling. The same assumption must be made if
the minimum life turns out to be negative.
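The step from the vanishing-minimum-life condition to the ratio criterion can be spelled out as follows. This is a reconstruction from (3.6') and (3.7'), not a derivation given in the paper; $\Gamma_k$ is shorthand introduced here for $\Gamma(1 + k/\hat{\alpha}_S)$, and the last line assumes that the sample variance is taken as the mean square minus the squared mean, i.e. that the factor $n/(n-1)$ of (3.3') is neglected.

```latex
\begin{aligned}
\hat{N}_{0,S} = 0
  &\iff \bar{N}_S = s_S\,[B(\hat{\alpha}_S) - A(\hat{\alpha}_S)]
        = s_S\,\frac{\Gamma_1}{\sqrt{\Gamma_2 - \Gamma_1^2}} \\
  &\iff \frac{s_S^2}{\bar{N}_S^2} = \frac{\Gamma_2 - \Gamma_1^2}{\Gamma_1^2}
        = \frac{\Gamma_2}{\Gamma_1^2} - 1 \\
  &\iff \frac{\overline{N_S^2}}{\bar{N}_S^2} = \frac{\Gamma_2}{\Gamma_1^2}
        \qquad \text{(using } s_S^2 \approx \overline{N_S^2} - \bar{N}_S^2\text{)} .
\end{aligned}
```

In words, the minimum life vanishes when the squared coefficient of variation of the observed numbers of cycles equals the value implied by the estimated $\alpha_S$ alone.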

Finally the limiting number of cycles $N_{\omega,S}$ for which the probability
of survival is practically zero, is obtained from (2.9), (3.7) and (3.6)
as


(4.8)    $N_{\omega,S} = \bar{N}_S + \sigma_S[A(\alpha_S) + B(\alpha_S)(e^{2.5/\alpha_S} - 1)]$.


It may be estimated by replacing the population values $\bar{N}_S$ and $\sigma_S$
by the sample values and using the functions $A(\alpha_S)$ and $B(\alpha_S)$ given in
Table 1.

If the computed minimum life is longer than the smallest number
of cycles at fracture, or if the largest observed number exceeds $N_{\omega,S}$,
it must be taken into account in the evaluation of such contradictions
that the estimations of $N_{0,S}$ and $N_{\omega,S}$ are subject to considerable errors
of random sampling, which so far are unknown. This holds also for the
different criteria to be used to prove or disprove the existence of a
non-vanishing minimum life.


5. FATIGUE IN NICKEL AND ALUMINUM 
Table 2 summarizes the observed number of cycles to fracture at


TABLE 2
NICKEL. REVERSED TORSION WIRE TESTS (RAVILLY [5])

Fatigue Life N, Thousands of Cycles
Elastic Stresses S in kg/mm.²

Specimen   Plotting
Number m   Position   ±15.5  ±18.0  ±21.5  ±25.5  ±30  ±33  ±37.5  ±45  ±49  ±56
1 — 0.054 1,040 44 28 1180) 8.3 584 40.0 23.0 175 110 
2 0.048 1,80 44 — 290 1200 87.2 632 43.9 265.0 20.2 118  — 
з 0.851 1,110 44 — 223 14.0 872 044 454 26.7 205 120 — 
4 080% 1,140 44 — 28 140 8.7 051 465 269 26 190 
5 0.7010 1,150 — 40 208 — 144.5 — 95.8 68.0 11 284 20.7 19.1 
6 — 0.7143 1160 450 — 242 17.0 059 69.5 48.0 М5 20.8 138 
7 — 0.007 1,170 459 — 244 — 148.0 90.5 60.7 48.2 28.0 20.8 1950 —— 
8 060 1,180 — 401 — 247 1510 98.5 70.5 445 29.2 20.0 19.8 — — 
9 0.574 1,180. 45 248 130 1014 71.0 49.3 08 21.0 140 . 
10 0.5238 1,20 47 — 258 1540 1024 72.0 49.9 29.9 211 41 — 
п 0.4762 1,210 — 480 — 253 — 167.0 1040 73.8 51.3 29.9 215 142 T 
12 0.4286 1,225 — 489 24 — 158.0 1051 74.0 51.6 30.0 21.9 149 
13 0.3810 1,240 492 255 150.0 105.9 75.1 52.3 30.3 221 М4 
14 0.3333 1,270 499 257 164.0 106.0 76.0 52.5 30.5 22.8 145 ү 
0.257 ,1,280 — 503 22 1650 160 76.0 53.7 31.0 23 1&0 — 
10 0.2381 1,280 0 204 160 105.2 77.5 515 318 22.7 ШТ 
17 0.1905 — 1,910 510 287 к 170.0 ^106.6 78.0 54.8 32.0 22.9 148 T 
18 0.1429 1,350 — 516 — 289 1720 108.0 78.0 ‚550 330 22.1 158 
19 0.0952 1,460: £534 202 173.0 1080 79.2 55.7 336 23.7 161 
20 0.047 1,620 558 , 294 — 19.0 108.2 80.0 57.0 347, 24.1 164 


Fig. 7. Fatigue tests of nickel wire in reversed torsion. (See Tables 2 and 3.)
(Abscissa: logarithmic scale, $N$ in 1000.)


different stress amplitudes for nickel. These data, taken from Ravilly's 
observations [5] are traced on logarithmic extremal probability paper 
in Figure 7 using the plotting positions (1.7). 

Table 3 gives the estimations of the three parameters. The estimates
for the minimum life $N_{0,S}$ and the characteristic number $V_S$ diminish
with increasing stress level as might be expected. The estimated values
of the scale parameters $1/\alpha_S'$ indicate that in most cases the mode
exceeds the median, a fact that contradicts the usual assumptions
concerning the shape of the distribution functions in fatigue. Within the
linear theory, the estimates of $1/\alpha_S$ do not show any systematic
dependence upon $S$ and the hypothesis that their variation is due to
chance seems admissible on the basis of the considered tests.


TABLE 3
PARAMETERS FOR NICKEL WIRE

Non-Linear Theory

Stress Level (kg. per mm.²)   $S$               ±15.5        ±18.0       ±21.5       ±25.5
Number of Spec.               $n$                  20           20          20          20
Mean                          $\bar{N}_S$     1,232.75       479.75      252.45     154.625
Stand. Dev.                   $s_S$           132.7193      34.7046     24.4228     17.0979
Third Moment                  $m_{3,S}$     2,861,621.     13,422.2    2,524.58   -1,712.34
Scale Parameter               $1/\alpha_S'$    0.31438      0.16791     0.14554     0.07604
Char. Number                  $V_S$           1,249.93       490.29      260.48      161.44
Sensitivity Limit             $N_{0,S}$       1,051.63       396.13      185.56       70.24

Linear Theory

Stress Level (kg. per mm.²)   $S$                 ±25.5     ±30       ±33       ±37.5     ±45       ±49       ±56
Number of Spec.               $n$                    20       20        20        20        20        20        20
Mean Logarithm                $\overline{\log N}$  5.18664  4.99871   4.85573   4.69950   4.47015   4.33224   4.14408
Geom. Std. Dev.               $s(\log N)$          0.04850  0.03490   0.03594   0.03813   0.04118   0.02996   0.04112
Scale Parameter               $1/\alpha_S'$        0.04564  0.03284   0.03382   0.03588   0.03875   0.02819   0.03869
Log. Char. Number             $\log V_S$           5.21001  5.01558   4.87309   4.71793   4.49005   4.34672   4.16455
Char. Number                  $V_S$                162.20   103.65    74.66     52.29     30.91     22.22     14.6

Figure 7 shows that for nickel tested in reversed torsion the linear
theory gives an excellent fit for stress levels equal to or exceeding 25.5
kg. mm.⁻², for which, therefore, no minimum life appears to exist. For
stress levels below 25.5 kg. mm.⁻², however, the observations can be
better fitted by the three-parameter survivorship function, which
bends upwards for decreasing numbers of cycles, indicating the existence
of a minimum life. For the sake of comparison the theoretical
survivorship functions for the linear, two-parameter theory (1.1) and
for the general, three-parameter theory (2.1) are both shown for the
stress level 25.5 kg. mm.⁻².


It is interesting to note in Table 3 that the two estimates for $V_S$ are
practically equal. The observed and the two theoretical survivorship
functions in this case are also shown in Figure 8 in the conventional
way where the number of cycles is traced as abscissa and the survivorship
function as ordinate, both in linear scales. Again, no preference
can be given to either theory.

Fig. 8. Survivorship function for nickel at $S = \pm 25.5$ kg. mm.⁻²
(See Tables 2 and 3.)
(Abscissa: number of cycles in 1000; curves: observations, linear theory, general theory.)
Table 4 shows the calculation of the three parameters for Ravilly's
observations on aluminum wire for the three lowest stress levels. Figure
9 presents the data given in [2] together with the theoretical curves
obtained from Table 4. The survivorship function for the stress level
5.75 kg. mm.⁻² clearly shows how the minimum life converges toward
zero with increasing level of applied stress; if the minimum life is
sufficiently low and the stress level sufficiently high the general
(three-parameter) theory hardly differs from the linear one, since with
increasing stress levels the minimum life approaches zero so rapidly that
no distinction between the two theories is possible.


TABLE 4
THE THREE PARAMETERS FOR ALUMINUM WIRE [2]

Stress Level (kg. per mm.²)   $S$                ±5.25         ±5.5        ±5.75
Number of Spec.               $n$                   20           20           20
Mean                          $\bar{N}_S$      1,140.50       552.15       217.30
Stand. Dev.                   $s_S$             179.867     177.1485      57.4228
Third Moment                  $m_{3,S}$      2,672,050.   1,088,203.   -25,247.78
Scale Parameter               $1/\alpha_S'$     0.18952      0.14888      0.10240
Char. Number                  $V_S$            1,190.41       609.76       238.79
Sensitivity Limit             $N_{0,S}$          751.47        76.75         1.39


Tables 3 and 4 indicate that the $1/\alpha_S$ decrease with increasing stress
levels for those stresses where the minimum life does not vanish. While
similar relations have been reported by some investigators, other
observations show no such variation. The above relation cannot, therefore,
be accepted as well established. Since all estimated values of $1/\alpha_S$
are considerably below unity, the existence of an exponential distribution
of fatigue failures appears unlikely, for the reasons given above.


Fig. 9. Fatigue tests of annealed aluminum wire in reversed torsion.
(See Table 4.)
(Abscissa: logarithmic scale, $N$ in 1000; ordinates: probability of survival and reduced variate.)

CONCLUSIONS 


The existence of an "incubation period" of fatigue, that is of a finite
threshold number of cycles $N_{0,S}$ below which, at a given stress level,
fatigue failure will not occur and at which the probability of survival
is therefore equal to unity, has been established for certain metals and
stress amplitudes. This phenomenon has also been confirmed by fatigue
studies on both hard and mild structural steel at stress levels near and
below their static yield stress. Therefore a "sensitivity limit" in terms
of cycles (minimum life) appears to be as real an aspect of fatigue as
the sensitivity limit in terms of stress (endurance limit), the existence
of which is quite generally recognized.

The present investigation is part of a research project on the basic
aspects of fatigue, conducted at the Civil Engineering Research Labora- 


of Ordnance Research. 


REFERENCES 


[1] Freudenthal, A. M., "Planning and Interpretation of Fatigue Tests," American
    Society for Testing Materials, Special Technical Publication No. 121 (1953).

[2] Freudenthal, A. M., and Gumbel, E. J., "Statistical Interpretation of Fatigue
    Tests," Proceedings of the Royal Society A, 216 (1953), 309-32.

[3] National Bureau of Standards, "Probability Tables for the Analysis of Extreme
    Value Data," Applied Mathematics Series 22, Washington (1953).

[4] Neyman, J. and Pearson, E. S., "On the Use and Interpretation of Certain
    Test Criteria," Biometrika, 20 (1928), 175-240.

[5] Ravilly, E., "Contributions à l'étude de la rupture des fils métalliques
    soumis à des torsions alternées," Publications Scientifiques et Techniques du
    Ministère de l'Air, No. 120, Paris (1938), 52-10.

[6] Weibull, W., "The Phenomenon of Rupture in Solids," Ingeniörs
    Vetenskaps Akademien Handlingar No. 153, Stockholm (1939).

[7] Weibull, W., "A Statistical Representation of Fatigue Failure in Solids,"
    Trans. Royal Institute of Technology, No. 27, Stockholm (1949).


POINT ESTIMATES OF ORDINATES OF 
CONCAVE FUNCTIONS 


CLIFFORD HILDRETH*
North Carolina State College 

A method is developed for obtaining maximum likelihood 
estimates of points on a surface of unspecified algebraic form 
when ordinates of the points are required to satisfy a set of 
linear inequalities. A production function with one variable 
input is considered in some detail. In this case the restrictions
follow from the assumption of non-increasing returns. An illus- 
trative computation is worked out using a procedure based on 
equivalence between the estimation problem and a certain 
saddle point problem. Alternative procedures for production 
functions with two variable inputs are sketched. 


1. INTRODUCTION 


Economists are frequently in the position of having fairly strong
presumptions that relations among variables with which they deal
satisfy certain qualitative restrictions, but they seldom have very good 
grounds for saying that a particular algebraic form is appropriate for 
representing a given relation. Diminishing marginal productivity of 
inputs in production relations, downward slope of demand relations and 
homogeneity of certain demand and production relations are examples 
of properties which economists often assume. 

Unfortunately, statistical procedures available to economists typi- 
cally require that they completely ignore many of their a priori pre- 
sumptions in performing their statistical analyses or, alternatively, that 
they rather arbitrarily assume that a particular algebraic form satis- 
factorily represents the relation being investigated. In this paper pro- 
cedures are suggested for estimating points on a production surface of 
unspecified algebraic form from data on outputs produced by various
combinations of inputs when the inputs are subject to diminishing re- 
turns. An example involving a single variable input is worked out as an 
illustration. More generally one could apply the approach presented 
here to a variety of situations in which an investigator knows some 
properties of a relation being studied but does not have sufficient infor- 
mation to put the relation into any simple parametric form. Such situa- 
tions are not uncommon so applications of the general approach may 
arise in several fields. However, the author’s closer familiarity with
problems from economics makes it convenient to refer to this field
when a more specific context is wanted.

* The author is indebted to staff members of the Cowles Commission and to L. J. Savage of the
Committee on Statistics, University of Chicago, for suggestions and critical comments. This paper will
be reprinted as Journal Article No. 551 of the North Carolina Agricultural Experiment Station.

The procedures to be outlined should be regarded as supplementing 
rather than supplanting existing techniques. When an investigator is 
reasonably sure that a particular parametric form will satisfactorily
represent the relevant properties of his relation, there will ordinarily be
advantages of both efficiency and convenience in using the form. Methods
have also been developed for investigating particular aspects of a
relation with only mild assumptions as to its form. Davis [4, Ch. 6]
gives examples of smoothing devices that depend on local properties of
a function. Robbins and Monro [13] have presented a stochastic method
for approximating the value of a control variable associated with a
selected mean response. 

Procedures for estimating the locus of an extreme value of a function 
and for approximating its properties in that neighborhood have been 
suggested by Hotelling [7] and extended and refined by Friedman and 
Savage [5] and Box and Wilson [2].¹ These seem particularly useful if
the investigator is interested in a fairly small region about the extreme 
value and if he has either a pretty good a priori notion of where to find 
the extremum or has the opportunity to draw a fairly large sequential
sample. If some estimates of the value of the function over a rather large
region are needed, if fixed samples that cannot easily be repeated are
the main source of information, and/or if there exists appropriate a
priori information to be taken into account, then procedures like those
suggested in what follows may have advantages. There is nothing to
prevent an investigator from combining certain features of the present
analysis with suggestions of Hotelling and the others if some opportunity
exists to analyse certain initial data and then plan for the collection
of additional observations.

The reader will have a better basis for judging possible applications
after the illustrative production analysis has been presented. Statistical 
problems arising from the generation of variables by simultaneous 
economic relations are not considered in the present discussion nor are 
types of statistical inference other than point estimation. However, 
conventional tests of hypotheses could be used in many of the contemplated
situations.


2. PRODUCTION FUNCTIONS WITH ONE VARIABLE INPUT


Consider a production relation of the form 
(2.1)    $y = \phi(z) + u$


1 A brief review of these procedures has recently been published by Anderson [1].


600 AMERICAN STATISTICAL ASSOCIATION JOURNAL, SEPTEMBER 1954 


where $y$ represents output, $z$ represents variable input and $u$ is a random
disturbance. In general $z$ may be a vector with as many components as
there are types of variable input but for the present we assume that all
inputs except one are held constant, or as nearly constant as physical
conditions permit. If inputs are defined broadly enough to include all
factors influencing output, then variations in $u$ may be attributed to
unavoidable and unobserved variations in some of the constant inputs.
It is assumed that variations in $u$ approximate independent drawings
from a normal distribution with zero mean and finite variance.

An investigator is considered to have observations on $y$ and $z$ for $N$
selected values of $z$. Let the values of $z$ be arranged in increasing order
and denoted by $z_1, z_2, \cdots, z_n, \cdots, z_N$. For each level of input there
may be several trials and corresponding observations of output. Let
$T_n$ be the number of trials at level of input $z_n$ and let $y_{nt}$ be the observed
output for the $t$th trial at this level. We have

(2.2)    $y_{nt} = \phi(z_n) + u_{nt}, \qquad n = 1, 2, \cdots, N; \quad t = 1, 2, \cdots, T_n.$


An economist with such data is principally interested in the possi- 
bility of drawing inferences about which levels of input are most 
profitable for various combinations of prices of output and of variable 
input (or conditions of demand for output and supply of input). Fre- 
quently such inferences have been drawn by assuming that the function 
$\phi(z)$ can be approximated by some given algebraic form with several
unknown parameters to be estimated from the data. Estimates of the 
parameters are inserted into the form to obtain an estimated relation 
and this estimated relation is then used to calculate most profitable 
levels of input for chosen combinations of prices. 

The chief difficulty with this procedure is that the inferences often 
depend critically upon the algebraic form chosen. It is not uncommon
to find that alternative forms fit the data almost equally well but have 
very different implications for the most profitable level of input. One 
example of this may be found in a study by Paul R. Johnson [8] which 
will be utilized further in Section 3. A recent article by Prais [12] 
emphasizes the critical importance of the form of equation chosen to 
represent a demand relation. This problem would arise more frequently 
in economic literature if it were given attention commensurate with its 
importance. Many economists uncritically accept the appropriateness 
of a functional form chosen largely on the basis of convention or convenience;
others try several forms for their relations but report only the
one that in some sense “looks best” a posteriori.


If the investigator knew the prices of $y$ and $z$ (or more generally if
he knew the relevant demand relation for $y$ and supply relation for $z$)
and if these were expected to remain fixed during the period that his
statistical results were to be used, then he could express net revenue as
a function of $z$ and regard this as the relation to be studied. If, in addition,
he could proceed to draw new observations of $y$, and therefore of
net revenue, for chosen values of $z$, then he could apply the Hotelling
procedure (Friedman-Savage or Box-Wilson techniques could be used if $z$
were a vector) for estimating the point of maximum net revenue.
If he wished the analysis to apply to various price (or demand and supply)
situations or if he needed to draw some inferences before obtaining
new observations, then the above techniques would be difficult to apply.

To compare all levels of input the researcher would, of course, have
to make a very specific assumption about the form of $\phi(z)$. However,
in many studies this is not really necessary. If a reasonable basis can
be found for comparing the profitabilities of levels of input for which
data exist, such comparisons will often determine the optimal level of
input to as close an approximation as the available data permit. The
results are typically intended for use as a guide to actual producers.
Conditions faced by these producers can never be exactly duplicated in
experiments so there is always some error in transferring experimental
results to commercial situations. If the data are gathered by surveying
actual producing units, fundamentally the same problem will exist.
There will always be some more or less relevant discrepancies between
conditions faced by sample producers at the times they are surveyed
and conditions faced by producers who ultimately use the results at the
time their applications are made. Thus complete accuracy in the determination
of optimal input for the conditions represented by the data
would always be superfluous even if it were possible. Furthermore, if
comparisons among the observed levels of input are too crude when
these considerations are taken into account, it is ordinarily possible to
supplement the data and to obtain observations for additional levels
in what is expected to be the relevant region. Indeed, it may frequently
happen that an indication of the kinds of new observations that will
prove useful may be one of the most valuable results of an initial examination
of production data.

Let $\eta_n$ be the expected value of output at input level $z_n$,

(2.3)    $\eta_n = \phi(z_n), \qquad n = 1, 2, \cdots, N.$

We seek to construct a reasonable procedure for obtaining estimates of
the $\eta_n$; these can then be translated into estimates of expected profitability
of the $z_n$ for any given price combinations. Our estimates will be
derived by the method of maximum likelihood. However, the same results
could be obtained by least squares and possibly other methods.
Of course, if there were no a priori restrictions on $\phi$, then the maximum
likelihood estimates of the ordinates $\eta_n$ would just be the means of
observed outputs for the appropriate levels of input, i.e.,


(2.4)    $\tilde{\eta}_n = \bar{y}_n = \frac{1}{T_n} \sum_{t=1}^{T_n} y_{nt}$

where the $\tilde{\eta}_n$ might be called limited information maximum likelihood
estimates.² To obtain full maximum likelihood estimates (here denoted
by $\hat{\eta}_n$) the investigator must maximize the likelihood function subject
to all of the a priori restrictions he feels justified in imposing.
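The unrestricted estimates in (2.4) are simply the level means. A minimal sketch, using two columns of the yields listed in Table II of Section 3 as input; the function name `level_means` and the variable names are illustrative and do not come from the paper:

```python
from statistics import fmean

def level_means(observations):
    """Return (T_n, mean output) for each input level, as in (2.4)."""
    return [(len(ys), fmean(ys)) for ys in observations]

# Observed yields at 20 and 40 lbs. of nitrogen, copied from Table II.
yields_by_level = [
    [43.4, 27.3, 35.3, 42.2, 35.7, 50.1, 56.0, 42.1, 42.1],
    [44.9, 40.2, 96.9, 52.1, 85.1, 63.6, 77.3, 63.6],
]

summary = level_means(yields_by_level)
for t_n, mean in summary:
    print(t_n, round(mean, 2))
```

Rounded to two decimals, the two means agree with the entries 41.58 and 65.46 in the last row of Table II.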

As was indicated earlier, in many production situations the investi- 
gator will feel that inputs are subject to decreasing returns. This is 
equivalent to assuming that $\phi$ is concave and yields the following
restrictions on the ordinates³

(2.5)    $\frac{\eta_{n+1} - \eta_n}{z_{n+1} - z_n} \geq \frac{\eta_{n+2} - \eta_{n+1}}{z_{n+2} - z_{n+1}}.$
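Restriction (2.5) says that successive slopes between adjacent ordinates may not increase. The following sketch checks this for a given set of ordinates; the function name and the small tolerance `tol` are assumptions made for illustration, and the sample values are the input levels and mean yields reported in Table II of Section 3.

```python
def satisfies_concavity(z, eta, tol=1e-9):
    """True if successive slopes of eta over z are non-increasing, as in (2.5)."""
    slopes = [(eta[n + 1] - eta[n]) / (z[n + 1] - z[n]) for n in range(len(z) - 1)]
    return all(slopes[n] >= slopes[n + 1] - tol for n in range(len(slopes) - 1))

z = [0, 20, 40, 60, 80, 120, 160, 180]
means = [22.94, 41.58, 65.46, 58.81, 81.74, 82.15, 96.59, 94.01]

# The raw means violate (2.5): the slope from 20 to 40 lbs. exceeds the slope from 0 to 20.
print(satisfies_concavity(z, means))
```

Because the raw means fail the check, the restricted estimates must differ from them at some levels.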


For such cases it is desired to maximize the likelihood function subject
to (2.5). The logarithm of the likelihood function is given by

(2.6)    $L(\eta, \sigma^2) = -\frac{T}{2} \log 2\pi\sigma^2 - \frac{1}{2\sigma^2} \sum_{n=1}^{N} \sum_{t=1}^{T_n} (y_{nt} - \eta_n)^2$

where $T = \sum_{n=1}^{N} T_n$ and $\eta = (\eta_1 \; \eta_2 \cdots \eta_N)$. Since $\sigma^2$ is not restricted its
estimator can be obtained by differentiation, yielding⁴

(2.7)    $\hat{\sigma}^2 = \frac{1}{T} \sum_{n=1}^{N} \sum_{t=1}^{T_n} (y_{nt} - \hat{\eta}_n)^2$

where the $\hat{\eta}_n$ are those values which maximize $L$ and satisfy (2.5); thus
they minimize the double sum on the right of (2.6) under the restrictions
(2.5). We may write


2 For an analogous use of this term in another context see Koopmans and Hood [9].

3 The investigator may typically feel justified in assuming strict concavity in which case the strict
inequalities would hold in (2.5). However, if these restrictions are imposed, the likelihood function
may not have a maximum but only a least upper bound. Maximizing the likelihood function subject
to the restrictions given is equivalent to finding the least upper bound in the region defined by corresponding
strict inequalities.

4 This estimator may be expected to have a substantial downward bias when the $T_n$ are small. Its
distribution has not yet been investigated.
(2.8)    $\sum_{n=1}^{N} \sum_{t=1}^{T_n} (y_{nt} - \eta_n)^2 = \sum_{n=1}^{N} \sum_{t=1}^{T_n} (y_{nt} - \bar{y}_n)^2 + \sum_{n=1}^{N} T_n (\bar{y}_n - \eta_n)^2.$
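The decomposition in (2.8) follows from adding and subtracting the level mean inside each square. A short derivation for a single level $n$, written out here as a reading aid rather than taken from the paper:

```latex
\begin{aligned}
\sum_{t=1}^{T_n} (y_{nt} - \eta_n)^2
  &= \sum_{t=1}^{T_n} \bigl[(y_{nt} - \bar{y}_n) + (\bar{y}_n - \eta_n)\bigr]^2 \\
  &= \sum_{t=1}^{T_n} (y_{nt} - \bar{y}_n)^2
     + 2(\bar{y}_n - \eta_n) \sum_{t=1}^{T_n} (y_{nt} - \bar{y}_n)
     + T_n (\bar{y}_n - \eta_n)^2 \\
  &= \sum_{t=1}^{T_n} (y_{nt} - \bar{y}_n)^2 + T_n (\bar{y}_n - \eta_n)^2 ,
\end{aligned}
```

since $\sum_{t} (y_{nt} - \bar{y}_n) = 0$ by the definition of $\bar{y}_n$. Summing over $n$ gives (2.8).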


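The terms of the first summation over $n$ on the right do not depend on
$\eta$ and may be ignored. Let $x$ be an $N$-dimensional vector with elements

(2.9)    $x_n = \eta_n - \bar{y}_n, \qquad n = 1, 2, \cdots, N.$

The problem is then to minimize the weighted (by the $T_n$) sum of
squares of the $x_n$ subject to the requirements that the $\eta_n$ satisfy (2.5).
Equivalent restrictions on the $x_n$ are given by

(2.10)   $-\frac{1}{\Delta_{n+1}} x_n + \left( \frac{1}{\Delta_{n+1}} + \frac{1}{\Delta_{n+2}} \right) x_{n+1} - \frac{1}{\Delta_{n+2}} x_{n+2} - \frac{1}{\Delta_{n+1}} \bar{y}_n + \left( \frac{1}{\Delta_{n+1}} + \frac{1}{\Delta_{n+2}} \right) \bar{y}_{n+1} - \frac{1}{\Delta_{n+2}} \bar{y}_{n+2} \geq 0$

where $\Delta_{n+2} = z_{n+2} - z_{n+1}$, $\Delta_{n+1} = z_{n+1} - z_n$ and $n = 1, 2, \cdots, N-2$.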


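It will simplify the discussion to state the problem in matrix notation.

(2.11) Find an $\hat{x} \in \mathcal{A}$ such that $\hat{x} D \hat{x}' \leq x D x'$ for all $x \in \mathcal{A}$ where⁵

(i) $\mathcal{A}$ is the set of all vectors $x$ that satisfy $A x' + b' \geq 0$ and

(ii) $A = \begin{pmatrix}
-\frac{1}{\Delta_2} & \frac{1}{\Delta_2} + \frac{1}{\Delta_3} & -\frac{1}{\Delta_3} & 0 & \cdots & 0 & 0 & 0 \\
0 & -\frac{1}{\Delta_3} & \frac{1}{\Delta_3} + \frac{1}{\Delta_4} & -\frac{1}{\Delta_4} & \cdots & 0 & 0 & 0 \\
\vdots & & & & & & & \vdots \\
0 & 0 & 0 & 0 & \cdots & -\frac{1}{\Delta_{N-1}} & \frac{1}{\Delta_{N-1}} + \frac{1}{\Delta_N} & -\frac{1}{\Delta_N}
\end{pmatrix}$

(iii) $b' = A \bar{y}'$

(iv) $\bar{y} = (\bar{y}_1 \; \bar{y}_2 \cdots \bar{y}_N)$

(v) $D = \begin{pmatrix} T_1 & & & 0 \\ & T_2 & & \\ & & \ddots & \\ 0 & & & T_N \end{pmatrix}$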


5 In the special case in which the input levels are equally spaced and the same number of trials
exist for each level we may take $D = I$ and

$A = \begin{pmatrix}
-1 & 2 & -1 & 0 & \cdots & 0 & 0 & 0 \\
0 & -1 & 2 & -1 & \cdots & 0 & 0 & 0 \\
\vdots & & & & & & & \vdots \\
0 & 0 & 0 & 0 & \cdots & -1 & 2 & -1
\end{pmatrix}$

since multiplication of $A$ by a positive constant does not change the problem. In this
case, multiplication of a column vector by $A$ yields the negative of the second differences of the elements
of the vector.


The investigator can readily obtain $D$, $A$, $b$ from his input-output
data. We seek a way to compute $\hat{x}$ and thereby $\hat{\eta}$. Iterative procedures
have generally been found most useful for this kind of computation.
One might, for example, adapt a gradient method of minimization⁶ to
this problem or one might choose an arbitrary $x$ satisfying the restrictions
and proceed to minimize the form $xDx'$ with respect to each component
of $x$ in turn holding the other components constant and observing
the relevant restrictions at each stage. It is argued in Section 4 that
the latter process would, in fact, converge to the minimizing vector.
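To make the first step concrete, the quantities $A$, $b$ and the diagonal of $D$ in (2.11) can be assembled directly from the input levels, the numbers of trials and the level means. The sketch below is one way to do so; the function name `build_problem` is hypothetical, and the sample figures are the summary rows of Table II in Section 3.

```python
def build_problem(z, trials, means):
    """Return (A, b, d) for problem (2.11); d holds the diagonal of D."""
    N = len(z)
    A = []
    for n in range(N - 2):
        d1 = z[n + 1] - z[n]        # Delta_{n+1}
        d2 = z[n + 2] - z[n + 1]    # Delta_{n+2}
        row = [0.0] * N
        row[n] = -1.0 / d1
        row[n + 1] = 1.0 / d1 + 1.0 / d2
        row[n + 2] = -1.0 / d2
        A.append(row)
    # b' = A ybar', item (iii) of (2.11)
    b = [sum(a * y for a, y in zip(row, means)) for row in A]
    return A, b, list(trials)

z = [0, 20, 40, 60, 80, 120, 160, 180]
trials = [27, 9, 8, 10, 9, 19, 10, 8]
means = [22.94, 41.58, 65.46, 58.81, 81.74, 82.15, 96.59, 94.01]
A, b, d = build_problem(z, trials, means)
print([round(val, 4) for val in b])
```

Each row of $A$ sums to zero, so the restrictions are unaffected by adding a constant to every ordinate. Multiplying the first three elements of `b` by 20 and the last three by 40 gives the scaled vector used in Section 3.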

However, for the particular problem stated in (2.11) it is possible to 
develop a more economical approach by using the fact that problems 
of extremization subject to inequalities have commonly been found to 
be equivalent to saddle point problems similar to those encountered in 
the Theory of Games. By a Theorem of Kuhn and Tucker [11, pp. 487, 
491-2] the minimization problem stated in (2.11) is equivalent to the 
following saddle point problem— 

(2.12) Find vectors $\hat{x}$, $\hat{v}$ such that

$\phi(\hat{x}, v) \leq \phi(\hat{x}, \hat{v}) \leq \phi(x, \hat{v})$

for all $x$ and for all $v \geq 0$ where

(2.13)   $\phi(x, v) = x D x' - v (A x' + b').$

Some of the general methods that have been developed for minimax
problems could doubtless be adapted to this case. However, the following
procedure seems exceedingly simple and is used in the example
of Section 3.


Since $D$ is positive definite, $\phi(x, v)$ has, for any $v$, a unique minimum
with respect to $x$ which may be found by differentiation.

(2.14)   $\frac{\partial \phi}{\partial x} = 2 D x' - A' v'.$

Setting the derivatives equal to zero yields

(2.15)   $x' = \tfrac{1}{2} D^{-1} A' v'$

and substituting into (2.13),

(2.16)   $\min_x \phi(x, v) = \phi^*(v) = -\tfrac{1}{4} v A D^{-1} A' v' - v b'.$
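The substitution leading to (2.16) can be written out term by term. With $x' = \tfrac{1}{2} D^{-1} A' v'$ and $D$ symmetric, so that $x = \tfrac{1}{2} v A D^{-1}$, the steps below are a reading aid and not part of the original text:

```latex
\begin{aligned}
\phi^*(v) &= x D x' - v A x' - v b' \\
          &= \tfrac{1}{4}\, v A D^{-1} D D^{-1} A' v' - \tfrac{1}{2}\, v A D^{-1} A' v' - v b' \\
          &= -\tfrac{1}{4}\, v A D^{-1} A' v' - v b' .
\end{aligned}
```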
To find the non-negative $N-2$ dimensional vector $\hat{v}$ that maximizes
this expression it is convenient to consider an equivalent minimum
problem. Let

(2.17)   $C = A D^{-1} A'$ and

(2.18)   $\theta(v) = -2\phi^*(v) = \tfrac{1}{2} v C v' + 2 v b'.$

6 See, for example, Chernoff and Divinsky [3, pp. 246-7].


Clearly $\hat{v}$ minimizes $\theta(v)$. Since $A$ has $N-2$ linearly independent rows,
it may be noted that $C$ is positive definite. The procedure to be followed
in finding $\hat{v}$ is an iterative one. An initial value, say $v^{(0)} = (v_1^{(0)} \; v_2^{(0)}
\cdots v_{N-2}^{(0)})$, is chosen as a starting point for the iteration. Holding all
components of $v$ except the first fixed at their values given by $v^{(0)}$, the
non-negative value of $v_1$ which minimizes $\theta(v)$ is found. Call this value
$v_1^{(1)}$. $\theta(v)$ is then minimized with respect to admissible values of the second
coordinate holding $v_1$ fixed at $v_1^{(1)}$ and $v_3$ to $v_{N-2}$ fixed at $v_3^{(0)}$ to $v_{N-2}^{(0)}$.
This process of minimizing with respect to each coordinate in turn
while holding others fixed at their last obtained values is continued
until the desired degree of stability⁷ is obtained.

Let $v_k$, $k = 1, 2, \cdots, K$, be a given component of $v$ where, of course,
$K = N - 2$. The procedure indicated above can be made more explicit
by observing that the minimum of $\theta(v)$ with respect to any $v_k$ is either
attained where $v_k = 0$ or where $\partial\theta/\partial v_k = 0$. If the latter equation yields a
non-negative value for $v_k$ then this is the minimizing value, otherwise
$v_k = 0$ is the minimizing value. We note

(2.19)   $\frac{\partial \theta}{\partial v} = C v' + 2 b'.$

At the $p$th stage of the iteration, we define $w_k^{(p)}$ as the value of the
$k$th coordinate that would be obtained by setting $\partial\theta/\partial v_k = 0$, i.e.,

(2.20)   $w_k^{(p)} = -\sum_{i=1}^{k-1} \frac{c_{ki}}{c_{kk}} v_i^{(p)} - \sum_{i=k+1}^{K} \frac{c_{ki}}{c_{kk}} v_i^{(p-1)} - 2 \frac{b_k}{c_{kk}}$

where the $c_{ki}$ are elements of $C$ and $k = 1, 2, \cdots, K$.
The value of the $k$th coordinate of $v$ at the $p$th stage of the iteration
is then obtained by taking

(2.21)   $v_k^{(p)} = \max (w_k^{(p)}, 0).$

For the production problem described earlier it is convenient to start
the process by setting $v^{(0)} = 0$. The process then generates a sequence of
vectors which we denote by $\{v^{(m)}\}$, i.e.,


(2.22)   $v^{(0)} = (0 \; 0 \; 0 \cdots 0)$
         $v^{(1)} = (v_1^{(1)} \; 0 \; 0 \cdots 0)$
         $v^{(K)} = (v_1^{(1)} \; v_2^{(1)} \cdots v_K^{(1)})$
         $v^{(K+1)} = (v_1^{(2)} \; v_2^{(1)} \cdots v_K^{(1)})$
         $v^{(pK+k)} = (v_1^{(p+1)} \cdots v_k^{(p+1)} \; v_{k+1}^{(p)} \cdots v_K^{(p)})$
         etc.

7 It is really a certain level of accuracy that is desired. In many iterative processes one intuitively
associates this with the observed degree of stability. It would be worth while however to investigate
circumstances under which the two may differ.


The function to be minimized, $\theta(v)$, then defines a corresponding sequence
of scalars which we indicate by $\{\theta(v^{(m)})\}$. In Section 5 it is shown
that the sequence $\{\theta(v^{(m)})\}$ converges to a unique minimum. It will be
seen that the proof depends essentially on the existence of a unique
minimum, the boundedness of $\{v^{(m)}\}$, and the continuity of $\theta(v)$ and its
first derivatives. This proof is deferred because some readers may wish
to see the process illustrated and discussed before becoming involved in
the mathematical details of the proof. Once $\hat{v}$ has been obtained, $\hat{x}$ may
be found from (2.15) and the estimates of the ordinates, the $\hat{\eta}_n$, follow
from (2.9). The distribution of these estimates has not as yet been
investigated. In Section 3, the computing procedure just described is
applied to illustrative production data.
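The iteration defined by (2.20) and (2.21), followed by the recovery of the ordinates through (2.15) and (2.9), can be sketched in a few lines. This is a minimal illustration rather than the author's own program: the function name `fit_concave` is hypothetical, and a fixed number of sweeps stands in for the "desired degree of stability" criterion. The four-point data set is invented to keep the arithmetic easy to follow.

```python
def fit_concave(A, b, trials, means, sweeps=200):
    """Coordinate-wise iteration (2.20)-(2.21) on the dual vector v."""
    K, N = len(A), len(means)
    # C = A D^{-1} A' with D = diag(trials), as in (2.17).
    C = [[sum(A[i][n] * A[j][n] / trials[n] for n in range(N)) for j in range(K)]
         for i in range(K)]
    v = [0.0] * K                       # start from the zero vector
    for _ in range(sweeps):
        for k in range(K):
            # v holds updated values for i < k and previous values for i > k.
            others = sum(C[k][i] * v[i] for i in range(K) if i != k)
            w = -(others + 2.0 * b[k]) / C[k][k]      # (2.20)
            v[k] = max(w, 0.0)                        # (2.21)
    # (2.15): x' = (1/2) D^{-1} A' v'; then eta = x + ybar from (2.9).
    x = [0.5 * sum(A[k][n] * v[k] for k in range(K)) / trials[n] for n in range(N)]
    return [xn + yn for xn, yn in zip(x, means)], v

# Hypothetical example: four equally spaced levels, one trial at each.
A = [[-1, 2, -1, 0], [0, -1, 2, -1]]
means = [0.0, 2.0, 1.0, 3.0]
trials = [1, 1, 1, 1]
b = [sum(a * y for a, y in zip(row, means)) for row in A]
eta, v = fit_concave(A, b, trials, means)
print(eta, v)
```

In this example only the second restriction is binding, so the last three fitted ordinates lie on a straight line while the first is left at its observed mean.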


3. AN ILLUSTRATIVE COMPUTATION 


The primary purpose of the illustration is to show how the computing 
procedure developed in the previous section can be applied. While data 
from actual experiments are used, I have not examined the original
reports of these experiments and do not have any firm judgment about
the appropriateness of combining data from these various experiments
in the simple analysis proposed here. For this reason I do not try to
discuss the economic implications of the data but merely use them to
illustrate a computing procedure.

The data are taken from corn fertility experiments conducted at
North Carolina State College. These have been summarized in a bulle- 
tin by Krantz [10]. Corn yields that resulted from various applications 
of nitrogen fertilizer are available. Paul R. Johnson [8] has used the 
results for fitting production functions under several alternative assumptions
about the algebraic form of the function. Prior to his
analysis Johnson made an attempt to select experiments that would
provide observations of yields under fairly similar conditions in all respects
except level of nitrogen applied. Only results from plots with
closely related soil types and from years of “good” weather were used.⁹
Johnson’s data consisted partly of direct observations and partly of
interpolated values. Only the direct observations are used in the present
calculation.


TABLE I

FITTED EQUATIONS AND OPTIMAL INPUTS FROM
JOHNSON STUDY

                                          Nitrogen Application to
Fitted Equation                           Maximize Profit

$y = 4.504\,(z + 20)^{.59}$               5340
$y = 25.16 + .7595\,z - .00209\,z^2$      164
$y = 108 - 82.48\,(.9897)^z$              230

$y$ represents yield in bushels, $z$ represents nitrogen in lbs.


9 Good weather is attributed by Johnson to years in which “the rainfall distribution was about
normal and soil moisture conditions were not low enough to cause the leaves to roll during this period
(a five-week critical period including the time of tasseling).” In principle one should take account of
the weather effect in the statistical specification if it is believed to have significantly affected the observed
yields. If this were done and the weather and input effects were assumed to be additive, (2.2)
would be replaced by

(2.2′)   $y_{nmt} = \eta_n + \delta_m + u_{nmt}$

where $m = 1, 2, \cdots, M$ is an index of the year of a particular observation and $\delta_m$ is the weather effect
in that year. Let $T_{nm}$ be the number of observations at input level $z_n$ in year $m$. If the $T_{nm}$ are equal for
all $n$ and $m$, then the weather effect causes no significant complication. If we impose the natural requirement
that $\sum_m \delta_m = 0$, we have

$\hat{\delta}_m = \frac{\sum_n \sum_t y_{nmt}}{\sum_n T_{nm}} - \frac{\sum_n \sum_m \sum_t y_{nmt}}{\sum_n \sum_m T_{nm}}$

for $m = 1, 2, \cdots, M$. The $\hat{\eta}_n$ can be obtained by minimizing

$\sum_n \Bigl( \sum_m T_{nm} \Bigr) \Bigl( \eta_n - \frac{\sum_m \sum_t y_{nmt}}{\sum_m T_{nm}} \Bigr)^2$

subject to the restrictions in (2.5) and the procedure developed in Section 2 applies directly. If the $T_{nm}$
are unequal, as in the present case, the situation is more complicated. It seemed undesirable to introduce
these complications in the present illustration especially since an effort had previously been
made to select homogeneous years.

In addition to approximating the underlying functional relation,
Johnson was interested in the optimal application of nitrogen when
nitrogen costs $0.137 per lb. and corn sells for $1.75 per bushel. While
all of his equations fitted the data reasonably well, they differed substantially
in their implications for the most profitable level of nitrogen.
This is shown in Table I where the first column shows the relations
with estimated parameters filled in and the second column indicates
the corresponding level of nitrogen for highest profit.
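The sensitivity to functional form can be checked by setting the marginal product of nitrogen equal to the price ratio for each fitted equation. The sketch below does this in closed form. The coefficients are as read from Table I, the variable names are illustrative, and because the printed coefficients are rounded the results should be expected to agree with the table only approximately.

```python
import math

PRICE_RATIO = 0.137 / 1.75   # price of nitrogen per lb. over price of corn per bushel

# Power form y = 4.504 (z + 20)^0.59:
# marginal product 4.504 * 0.59 * (z + 20)^(-0.41) set equal to the price ratio.
z_power = (4.504 * 0.59 / PRICE_RATIO) ** (1 / 0.41) - 20

# Quadratic form y = 25.16 + 0.7595 z - 0.00209 z^2:
# marginal product 0.7595 - 2 * 0.00209 * z.
z_quadratic = (0.7595 - PRICE_RATIO) / (2 * 0.00209)

# Exponential form y = 108 - 82.48 (0.9897)^z:
# marginal product -82.48 * ln(0.9897) * (0.9897)^z.
z_exponential = (math.log(PRICE_RATIO / (-82.48 * math.log(0.9897)))
                 / math.log(0.9897))

print(round(z_power), round(z_quadratic), round(z_exponential))
```

All three forms are concave in $z$, so the first-order condition identifies the profit maximum in each case. The power form places the optimum far outside the range of the data, which is the point the table is making.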

Unless one has considerable a priori confidence in the appropriate- 
ness of a particular algebraic form, there is considerable uncertainty 
about the implication of the data for the decision as to level of input.
To illustrate the alternative procedure developed in Section 2, the observations
are first listed in Table II.


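TABLE II
OBSERVED YIELDS AT SPECIFIED LEVELS OF NITROGEN

NITROGEN (lbs./acre) ($z_n$) |     0 |    20 |    40 |    60 |    80 |    120 |   160 |   180
-----------------------------|-------|-------|-------|-------|-------|--------|-------|------
YIELDS (bu./acre) ($y_{nt}$) |   9.9 |  43.4 |  44.9 |  52.2 |  79.0 |   72.0 |  81.5 |  74.7
                             |  81.3 |  27.3 |  40.2 |  66.0 |  68.6 |   74.1 |  72.9 | 110.3
                             |  32.0 |  35.3 |  96.9 |  74.0 |  59.8 |   78.8 | 117.1 | 102.7
                             |  24.2 |  42.2 |  52.1 |  64.3 |  81.7 |  107.0 | 102.3 | 120.9
                             |  18.8 |  35.7 |  85.1 |  77.3 | 107.1 |  102.5 | 114.3 | 103.9
                             |  25.0 |  50.1 |  63.6 |  34.0 |  48.5 |   68.7 |  70.2 |  98.2
                             |   2.8 |  56.0 |  77.3 |  58.5 |  94.6 |   78.1 |  83.9 |  70.7
                             |  17.4 |  42.1 |  63.6 |  49.5 | 101.8 |   12.4 | 104.9 |  70.7
                             |  14.6 |  42.1 |       |  62.2 |  94.6 |   74.0 | 113.9 |
                             |  25.8 |       |       |  50.1 |       |   69.8 | 104.9 |
                             |  13.8 |       |       |       |       |   85.0 |       |
                             |  33.0 |       |       |       |       |   80.8 |       |
                             |  19.8 |       |       |       |       |   73.0 |       |
                             |  25.0 |       |       |       |       |  100.5 |       |
                             |  63.9 |       |       |       |       |  115.5 |       |
                             |  24.4 |       |       |       |       |   92.1 |       |
                             |  22.8 |       |       |       |       |   83.9 |       |
                             |  51.7 |       |       |       |       |   96.3 |       |
                             |  11.6 |       |       |       |       |   96.3 |       |
                             |  14.4 |       |       |       |       |        |       |
                             |  20.8 |       |       |       |       |        |       |
                             |  19.1 |       |       |       |       |        |       |
                             |   7.2 |       |       |       |       |        |       |
                             |  16.6 |       |       |       |       |        |       |
                             |  55.2 |       |       |       |       |        |       |
                             |   8.7 |       |       |       |       |        |       |
                             |  15.2 |       |       |       |       |        |       |
-----------------------------|-------|-------|-------|-------|-------|--------|-------|------
Sum                          | 619.5 | 374.2 | 523.7 | 588.1 | 735.7 | 1560.8 | 965.9 | 752.1
No. of Obs. ($T_n$)          |    27 |     9 |     8 |    10 |     9 |     19 |    10 |     8
Mean ($\bar{y}_n$)           | 22.94 | 41.58 | 65.46 | 58.81 | 81.74 |  82.15 | 96.59 | 94.01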




From these data, the following are readily computed. 


(3.1)    $D^{-1} = \begin{pmatrix} \frac{1}{T_1} & & & 0 \\ & \frac{1}{T_2} & & \\ & & \ddots & \\ 0 & & & \frac{1}{T_8} \end{pmatrix} = \mathrm{diag}(.0370,\ .1111,\ .1250,\ .1000,\ .1111,\ .0526,\ .1000,\ .1250)$

(3.2)    $A = \begin{pmatrix}
-1 & 2 & -1 & 0 & 0 & 0 & 0 & 0 \\
0 & -1 & 2 & -1 & 0 & 0 & 0 & 0 \\
0 & 0 & -1 & 2 & -1 & 0 & 0 & 0 \\
0 & 0 & 0 & -2 & 3 & -1 & 0 & 0 \\
0 & 0 & 0 & 0 & -1 & 2 & -1 & 0 \\
0 & 0 & 0 & 0 & 0 & -1 & 3 & -2
\end{pmatrix}$

The meanings of the restrictions are not changed if any row of $A$ is
multiplied by any positive constant. This sometimes makes it possible
to choose convenient numbers for elements. The $A$ above differs from
that defined in (2.11) in that the above has been obtained by multiplying
the first three rows by 20 and the last three rows by 40. Continuing

(3.3)    $b' = A \bar{y}' = \begin{pmatrix} -5.24 \\ 30.53 \\ -29.58 \\ 45.45 \\ -14.03 \\ 19.60 \end{pmatrix}$

(3.4)    $C = A D^{-1} A' = \begin{pmatrix}
.6064 & -.4722 & .1250 & 0 & 0 & 0 \\
-.4722 & .7111 & -.4500 & .2000 & 0 & 0 \\
.1250 & -.4500 & .6361 & -.7333 & .1111 & 0 \\
0 & .2000 & -.7333 & 1.4525 & -.4385 & .0526 \\
0 & 0 & .1111 & -.4385 & .4215 & -.4052 \\
0 & 0 & 0 & .0526 & -.4052 & 1.4526
\end{pmatrix}$

From these, formulas for the $w_k^{(p)}$ are readily obtained.


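(3.5)    $w_1^{(p)} = .7787\,v_2^{(p-1)} - .2061\,v_3^{(p-1)} + 17.28$
         $w_2^{(p)} = .6640\,v_1^{(p)} + .6328\,v_3^{(p-1)} - .2812\,v_4^{(p-1)} - 85.86$
         $w_3^{(p)} = -.1965\,v_1^{(p)} + .7074\,v_2^{(p)} + 1.1527\,v_4^{(p-1)} - .1746\,v_5^{(p-1)} + 93.00$
         $w_4^{(p)} = -.1377\,v_2^{(p)} + .5048\,v_3^{(p)} + .3019\,v_5^{(p-1)} - .0362\,v_6^{(p-1)} - 62.58$
         $w_5^{(p)} = -.2636\,v_3^{(p)} + 1.0403\,v_4^{(p)} + .9613\,v_6^{(p-1)} + 66.57$
         $w_6^{(p)} = -.0362\,v_4^{(p)} + .2789\,v_5^{(p)} - 26.99.$

From (3.5) the values of $v_k^{(p)}$ shown in Table III resulted when $v^{(0)}$ was
taken equal to the zero vector. It is recalled that

(2.21)   $v_k^{(p)} = \max (w_k^{(p)}, 0).$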


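TABLE III
SUCCESSIVE VALUES OF $v_k^{(p)}$

 k \ p |     1 |     2 |     3 |     4 |     5
-------|-------|-------|-------|-------|------
   1   | 17.28 |     0 |     0 |     0 |     0
   2   |     0 |     0 |     0 |     0 |     0
   3   | 89.60 | 85.50 | 85.31 | 85.30 | 85.30
   4   |     0 |     0 |     0 |     0 |     0
   5   | 42.95 | 44.03 | 44.08 | 44.08 | 44.08
   6   |     0 |     0 |     0 |     0 |     0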


Applying (2.15) we have

(3.6)    $\hat{x}' = \tfrac{1}{2} D^{-1} A' \hat{v}' = (0,\ 0,\ -5.33,\ 8.53,\ -7.19,\ 2.32,\ -2.20,\ 0)'$

and recalling (2.9), the maximum likelihood estimates of the ordinates
are given by

(3.7)    $\hat{\eta}' = \hat{x}' + \bar{y}' = (22.94,\ 41.58,\ 60.13,\ 67.34,\ 74.55,\ 84.47,\ 94.39,\ 94.01)'.$


These are shown together with the original observations and their 
means in Figure 1. 
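The whole computation of this section can be retraced from the summary rows of Table II. The sketch below forms $b$ and $C$ from the scaled matrix $A$ of (3.2), runs the coordinate-wise iteration, and recovers the estimates. It is an independent re-computation in floating point rather than the author's worksheet, so intermediate values may differ from the printed ones in the second decimal place; the fixed count of 50 sweeps and all variable names are assumptions of this sketch.

```python
# Summary figures from Table II: trials T_n and mean yields at each level.
trials = [27, 9, 8, 10, 9, 19, 10, 8]
means = [22.94, 41.58, 65.46, 58.81, 81.74, 82.15, 96.59, 94.01]

# Rows of A as scaled in (3.2).
A = [
    [-1, 2, -1, 0, 0, 0, 0, 0],
    [0, -1, 2, -1, 0, 0, 0, 0],
    [0, 0, -1, 2, -1, 0, 0, 0],
    [0, 0, 0, -2, 3, -1, 0, 0],
    [0, 0, 0, 0, -1, 2, -1, 0],
    [0, 0, 0, 0, 0, -1, 3, -2],
]
K, N = len(A), len(means)

# b' = A ybar' and C = A D^{-1} A', cf. (3.3) and (3.4).
b = [sum(A[k][n] * means[n] for n in range(N)) for k in range(K)]
C = [[sum(A[i][n] * A[j][n] / trials[n] for n in range(N)) for j in range(K)]
     for i in range(K)]

v = [0.0] * K
history = []                      # one entry per sweep, cf. Table III
for sweep in range(50):
    for k in range(K):
        others = sum(C[k][i] * v[i] for i in range(K) if i != k)
        v[k] = max(-(others + 2.0 * b[k]) / C[k][k], 0.0)   # (2.20), (2.21)
    history.append(list(v))

# x' = (1/2) D^{-1} A' v' and eta = x + ybar, cf. (3.6) and (3.7).
x = [0.5 * sum(A[k][n] * v[k] for k in range(K)) / trials[n] for n in range(N)]
eta = [x[n] + means[n] for n in range(N)]

for p in range(5):
    print(p + 1, [round(val, 2) for val in history[p]])
print([round(val, 2) for val in eta])
```

Only the third and fifth restrictions end up binding, which is why the estimates coincide with the observed means at the first two and the last input levels.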

While a comparison of profitability of various applications is sub- 
ject to the qualifications mentioned at the beginning of the section, it 
may be worthwhile to note that, at prices of $1.75 for corn and .137 for 
nitrogen, 160 lbs. would be better than the other levels according to our 
estimates. To get a good determination of optimal input for about this 
price ratio there should clearly be more observations in the 120-220 
lbs. interval, they should be more closely spaced, and weather effects
should be taken into account (see fn. 9). Since economists are usually
interested in the optimal production practices corresponding to a
number of possible price situations, a useful interpretation of the results 
could be obtained by determining for each observed level of input those 
price combinations at which a particular level of input is most profit- 
able. Such a treatment has been illustrated for hypothetical production 
data by Hildreth and Reiter [6]. One consequence of estimating points 
on a surface, instead of estimating parameters in an equation assumed 


to represent the surface, is that interpolation or extrapolation to regions
of the surface for which no observations are available depends directly
on judgment rather than on an initial assumption about the algebraic
form. In many contexts this should probably be counted an advantage
of form-free estimation. Experienced persons who may have [...]
form. In addition, the form-free procedure allows necessary interpolations
or extrapolations to be made at the stage of applying the results.
At this stage various adjustments for possible discrepancies between
experimental and commercial conditions need to be considered and
advice of persons familiar with the circumstances of a particular application
is likely to be available.

[Figure: observed yields, mean yields and estimates plotted against lbs. of nitrogen.]

Fig. 1. Observations and Estimates.
- Observed yield ($y_{nt}$)
X Mean yield ($\bar{y}_n$)
O Maximum likelihood estimate of expected yield ($\hat{\eta}_n$)
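The comparison of observed input levels at the quoted prices can be reproduced from the estimates in (3.7). The sketch below measures profitability as the value of the estimated yield less the cost of nitrogen, per acre, and ignores all other costs; that simplification, together with the names used, is an assumption made for illustration.

```python
CORN_PRICE = 1.75        # dollars per bushel
NITROGEN_PRICE = 0.137   # dollars per lb.

levels = [0, 20, 40, 60, 80, 120, 160, 180]
estimates = [22.94, 41.58, 60.13, 67.34, 74.55, 84.47, 94.39, 94.01]  # from (3.7)

def net_return(z, eta):
    """Value of estimated yield less nitrogen cost, per acre."""
    return CORN_PRICE * eta - NITROGEN_PRICE * z

returns = {z: net_return(z, eta) for z, eta in zip(levels, estimates)}
best = max(returns, key=returns.get)

for z in levels:
    print(z, round(returns[z], 2))
print("most profitable observed level:", best)
```

Repeating the calculation over a grid of price ratios would give, for each observed level, the range of prices at which that level is the most profitable, which is the interpretation suggested in the text.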


4. PRODUCTION FUNCTIONS WITH TWO VARIABLE INPUTS


Procedures similar to the one just illustrated could be applied to
other problems of estimating an unknown coordinate of points on a
surface about which the investigator has qualitative a priori information,
provided the a priori information could be translated into a set
of linear inequalities (equalities would produce no complication and
may be regarded as a special case) restricting the values of the unknown
coordinate. It is likely that complications will sometimes arise in the
translation of qualitative information into restrictions on the likelihood
function. It may also be expected that different computing techniques
will be required in some cases.

The proof of convergence of the iterative process used in Section 3
is seen in Section 5 to depend essentially on the existence of a unique
minimum, the boundedness of $\{v^{(m)}\}$ and the continuity of $\theta(v)$ and its
first derivatives. These conditions would all be met if the iteration were
carried out on the vector $x$ of the original problem (2.11) instead of
the vector $v$ of the dual problem. We could choose a vector $x^{(0)} = (x_1^{(0)} \;
x_2^{(0)} \cdots x_N^{(0)})$ to start the iteration, find an element $x_1^{(1)}$ which minimizes
$xDx'$ subject to $Ax' + b' \geq 0$ with $x_2 \cdots x_N$ held fixed at their initial
values, then minimize $xDx'$ with respect to $x_2$, etc. This would have
been more cumbersome (three $x_n$ would appear in each restriction and,
except at the ends, three restrictions would have to be examined each
time an $x_n$ were altered) than working with the dual problem in the
case considered. However, in some problems it may be useful to consider
an iteration on the original variables that enter the likelihood
function.

For example, consider the case of a production function with two
variable inputs, say

(4.1)  $y_{mnt} = \eta(s_m, z_n) + u_{mnt}$

where

$s_m$ = the $m$th level of one input,

$z_n$ = the $n$th level of the other input,

$y_{mnt}$ = observed output of the $t$th observation with input levels $s_m$ and $z_n$,

$u_{mnt}$ = value taken by the random disturbance on the $t$th trial with input levels $s_m$ and $z_n$,

and $m = 1, \ldots, M$; $n = 1, \ldots, N$; $t = 1, \ldots, T_{mn}$.

To simplify the discussion we assume that the $T_{mn}$ are all equal (the
case of unequal numbers of trials could be covered by substituting
$T_{mn} x_{mn}^2$ for $x_{mn}^2$ in (4.9) below and modifying subsequent equations
accordingly). Let

614 AMERICAN STATISTICAL ASSOCIATION JOURNAL, SEPTEMBER 1954 


(4.2)  $\eta_{mn} = \eta(s_m, z_n)$ and

(4.3)  $x_{mn} = \eta_{mn} - \bar{y}_{mn}$ where

(4.4)  $\bar{y}_{mn} = \dfrac{1}{T_{mn}} \sum_{t=1}^{T_{mn}} y_{mnt}$.


If we assume diminishing returns to each input, then the maximum
likelihood estimates of the $\eta_{mn}$ are obtained by minimizing the sum of
squares of the $x_{mn}$ subject to the restrictions imposed by diminishing
returns. Let $X$ denote the $M$-rowed, $N$-columned matrix with typical
element $x_{mn}$ and let $Y$ be the $M \times N$ matrix with typical element $\bar{y}_{mn}$.
The restrictions expressing diminishing returns to the first input are
given by


(4.5)  $A(X + Y) \ge 0$ where


$A$ is of order $(M-2) \times M$ and is analogous to the $A$ of Section 2; its
elements are given by


(4.6)  $a_{ij} = 0$  for $i > j$ and for $i < j - 2$,
       $a_{ij} = -\dfrac{1}{s_{i+1} - s_i}$  for $i = j$,
       $a_{ij} = \dfrac{1}{s_{i+1} - s_i} + \dfrac{1}{s_{i+2} - s_{i+1}}$  for $i = j - 1$,
       $a_{ij} = -\dfrac{1}{s_{i+2} - s_{i+1}}$  for $i = j - 2$.


Similarly, the assumption of diminishing returns to the second input
leads to the restrictions


(4.7)  $B(X' + Y') \ge 0$ where

$B$ is of order $(N-2) \times N$ with elements

(4.8)  $b_{ij} = 0$  for $i > j$ and for $i < j - 2$,
       $b_{ij} = -\dfrac{1}{z_{i+1} - z_i}$  for $i = j$,
       $b_{ij} = \dfrac{1}{z_{i+1} - z_i} + \dfrac{1}{z_{i+2} - z_{i+1}}$  for $i = j - 1$,
       $b_{ij} = -\dfrac{1}{z_{i+2} - z_{i+1}}$  for $i = j - 2$.
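
The restriction matrices of (4.6) and (4.8) can be illustrated with a short sketch. The function names `concavity_matrix` and `apply_matrix` and the sample input levels are hypothetical, chosen only for illustration. Each row compares the slope over one interval with the slope over the next, so the product is non-negative for ordinates showing diminishing returns, zero for linear ordinates, and negative for increasing returns.

```python
def concavity_matrix(levels):
    """Build the (len(levels) - 2) x len(levels) matrix whose rows follow (4.6)/(4.8)."""
    n = len(levels)
    rows = []
    for i in range(n - 2):
        h1 = levels[i + 1] - levels[i]        # s_{i+1} - s_i
        h2 = levels[i + 2] - levels[i + 1]    # s_{i+2} - s_{i+1}
        row = [0.0] * n
        row[i] = -1.0 / h1                    # case i = j
        row[i + 1] = 1.0 / h1 + 1.0 / h2      # case i = j - 1
        row[i + 2] = -1.0 / h2                # case i = j - 2
        rows.append(row)
    return rows


def apply_matrix(matrix, vector):
    """Matrix-vector product in plain Python."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


# Unequally spaced input levels (illustrative values only).
s = [0.0, 1.0, 2.5, 4.0, 7.0]
A = concavity_matrix(s)

concave = [-(x - 3.0) ** 2 for x in s]   # diminishing returns
linear = [2.0 * x + 1.0 for x in s]      # constant returns
convex = [x ** 2 for x in s]             # increasing returns

print(apply_matrix(A, concave))
print(apply_matrix(A, linear))
print(apply_matrix(A, convex))
```

The same function applied to the levels $z_n$ gives the matrix $B$ of (4.8).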




The problem is to find a matrix $X$ such that the sum of squares of its
elements is a minimum subject to (4.5) and (4.7). In the notation
already introduced we wish to minimize $\psi(X)$ subject to the diminishing
returns restrictions, letting


(4.9)  $\psi(X) = \sum_{m=1}^{M} \sum_{n=1}^{N} x_{mn}^2 = \operatorname{tr} XX'$


where tr is an abbreviation for trace.

This could be done by iterating on the elements of X subject to the 
restrictions (4.5) and (4.7). 

Alternatively one could consider the dual problem of finding the
minimax of

(4.10)  $\Psi(X, V, W) = \operatorname{tr} XX' - \operatorname{tr} VA(X + Y) - \operatorname{tr} WB(X' + Y')$

where $V$ is an $N \times (M-2)$ matrix of coefficients to be determined and
$W$ is a similar matrix of order $M \times (N-2)$. We seek the minimum with
respect to $X$ and the maximum with respect to $V$ and $W$ subject to
$V \ge 0$, $W \ge 0$.

As in the single input case, the minimizing values for $X$ can be found
by differentiation.

(4.11)  $\dfrac{\partial \Psi}{\partial X} = 2X' - VA - B'W' = 0$

(4.12)  $X = \tfrac{1}{2}(A'V' + WB)$.

Substituting this into (4.10) yields

(4.13)  $\psi^*(V, W) = -\tfrac{1}{4} \operatorname{tr} VAA'V' - \tfrac{1}{2} \operatorname{tr} VAWB - \tfrac{1}{4} \operatorname{tr} WBB'W' - \operatorname{tr} VAY - \operatorname{tr} WBY'$.

To complete the analogy with the single input case we define

(4.14)  $\theta(V, W) = -2\psi^* = \tfrac{1}{2} \operatorname{tr}(VAA'V' + 2VAWB + B'W'WB) + 2 \operatorname{tr} VAY + 2 \operatorname{tr} WBY'$



which is to be minimized subject to the nonnegativity of the elements
of $V$ and $W$. One could proceed to iterate for the minimizing elements
of $W$ and $V$. However, if $N$ and $M$ are very large this will be a long
process and might be no easier than to compute the original problem
of minimizing $\psi(X)$. It should also be noted that the proof of convergence
in Section 5 does not apply to this case because the quadratic part
of (4.13) is positive semidefinite rather than positive definite. While it
seems a reasonable conjecture that the computation would converge,
this problem needs further investigation. In any problem in which the
number of linear inequalities exceeds the number of variables in the
likelihood function, the quadratic part of the expression to be minimaxed
in the dual problem will be positive semidefinite and the convergence
of the suggested iteration will have to be shown.

These and other complications make it hard to foresee which kinds
of computing arrangements are likely to be most generally useful. As
experience indicates more exactly the kinds of relations and restrictions
to which applied workers want to apply methods like those developed
here, it will be useful to give more attention to this problem.


5. PROOF OF CONVERGENCE 


In this section we wish to show that the procedure suggested in
Section 2 leads in the limit to a unique minimum of the function $\theta(v)$
for $v$ in the closed positive orthant and therefore to estimates of ordinates
which maximize the likelihood function in its restricted domain.¹
We recall that $\{v^{(m)}\}$ is a sequence of vectors in the positive orthant of
a $K$ dimensional Euclidean space, and that $v^{(m+1)}$ is obtained from $v^{(m)}$ by
adjusting one element of $v$ so as to minimize $\theta(v)$ subject to the conditions
that the adjusted element remain non-negative and that other
elements retain the values assumed in $v^{(m)}$.

Recalling also that


(2.18)  $\theta(v) = \tfrac{1}{2} vCv' - 2vb'$


where $C$ is positive definite, we note that the sequence $\{\theta(v^{(m)})\}$ is
non-increasing and bounded below; therefore it converges.

In addition, it can be argued that $\theta(v)$ has a unique minimum for $v$
in the (closed) positive orthant. In the first place any minimizing point
must lie in the intersection of the positive orthant and the ellipsoid
$\theta(v) \le \theta(v^{(0)})$. Since this intersection is closed and bounded, a minimum
is attained there. Suppose there were two minimizing points, say $v^*$ and
$v^{**}$. The line segment joining them would lie in the positive orthant and
would also lie in the ellipsoid $\theta(v) \le \theta(v^*) = \theta(v^{**})$. Points in the interior
of this ellipsoid correspond to lower values of $\theta(v)$ than points on the
surface, i.e. $\theta(v) < \theta(v^*)$ for $v$ in the interior of $\{v : \theta(v) \le \theta(v^*)\}$. Unless
$v^* = v^{**}$, the line segment joining them contains interior points and the
supposition that $v^*$, $v^{**}$ were minimizing points is contradicted.
———— ем — 

¹ The main features of this proof were suggested by Roy Radner.

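We should next like to show that

(5.1)  $\lim_{p \to \infty} |v_k^{(p)} - v_k^{(p+1)}| = 0$  for all $k$.

Let $m = pK + k$. Then in passing from $v^{(m-1)}$ to $v^{(m)}$, only the $k$th coordinate
of $v$ changes. Consider the following:

(5.2)  $\theta(v^{(m-1)}) - \theta(v^{(m)}) = \tfrac{1}{2} c_{kk}\left[(v_k^{(p)})^2 - (v_k^{(p+1)})^2\right] + (v_k^{(p)} - v_k^{(p+1)})\left[\sum_{i=1}^{k-1} c_{ki} v_i^{(p+1)} + \sum_{i=k+1}^{K} c_{ki} v_i^{(p)} - 2b_k\right]$.

Changing the superscript in (2.20) yields

(2.20)  $w_k^{(p+1)} = -\sum_{i=1}^{k-1} \dfrac{c_{ki}}{c_{kk}} v_i^{(p+1)} - \sum_{i=k+1}^{K} \dfrac{c_{ki}}{c_{kk}} v_i^{(p)} + \dfrac{2b_k}{c_{kk}}$.

Substituting into (5.2) we obtain

(5.3)  $\theta(v^{(m-1)}) - \theta(v^{(m)}) = \tfrac{1}{2} c_{kk} (v_k^{(p)} - v_k^{(p+1)})\left[(v_k^{(p)} + v_k^{(p+1)}) - 2w_k^{(p+1)}\right] \ge \tfrac{1}{2} c_{kk} (v_k^{(p)} - v_k^{(p+1)})^2$.

To justify the mixed inequality we note that if $w_k^{(p+1)} \ge 0$ then $v_k^{(p+1)} = w_k^{(p+1)}$
and the equality holds. If $w_k^{(p+1)} < 0$, then $v_k^{(p+1)} = 0$ and
$(v_k^{(p)} - 2w_k^{(p+1)}) > v_k^{(p)} \ge 0$ and the mixed inequality holds. (5.1) follows from
(5.3) and the convergence of $\{\theta(v^{(m)})\}$.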
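
The iteration whose convergence is being proved can be sketched in a few lines. This is only an illustration under stated assumptions: it takes $\theta(v) = \tfrac{1}{2} vCv' - 2vb'$ with $C$ positive definite, computes the unconstrained coordinate minimizer $w_k$ as in (2.20), and sets $v_k = \max(0, w_k)$. The function names and the small test problem are hypothetical.

```python
def theta(C, b, v):
    """theta(v) = 1/2 v C v' - 2 v b' (the form of (2.18) assumed in this sketch)."""
    K = len(v)
    quad = sum(C[i][j] * v[i] * v[j] for i in range(K) for j in range(K))
    return 0.5 * quad - 2.0 * sum(b[i] * v[i] for i in range(K))


def coordinate_descent(C, b, v0, sweeps=200):
    """Cycle through the coordinates, minimizing theta over one v_k >= 0 at a time."""
    v = list(v0)
    K = len(v)
    history = [theta(C, b, v)]
    for _ in range(sweeps):
        for k in range(K):
            others = sum(C[k][i] * v[i] for i in range(K) if i != k)
            w = (2.0 * b[k] - others) / C[k][k]   # unconstrained minimizer, cf. (2.20)
            v[k] = max(0.0, w)                    # keep the adjusted element non-negative
            history.append(theta(C, b, v))
    return v, history


# Illustrative 2 x 2 problem (values are not from the paper).
C = [[2.0, 1.0], [1.0, 2.0]]
b = [1.0, -1.0]
v, history = coordinate_descent(C, b, [0.5, 0.5])
print(v, history[-1])
```

In this example the constrained minimum lies on the boundary $v_2 = 0$, and the recorded values of $\theta$ never increase from one step to the next, in line with (5.3).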

Let $P_k(v)$ be the vector obtained from $v$ by holding all but the $k$th
component fixed and minimizing $\theta(v)$ with respect to $v_k$. $P_k$ is a continuous
mapping of $K$ dimensional Euclidean space into itself. In our
sequence $\{v^{(m)}\}$ we have

(5.4)  $v^{(m)} = P_{m \sim K}(v^{(m-1)})$  where $m \sim K$ means $m$ modulo $K$.


Let $\theta_{\min} = \theta(v_{\min})$ be the value of our function at its minimum for
non-negative $v$. Let $\theta_\infty$ be the limit of our sequence $\{\theta(v^{(m)})\}$. We should like
to show that $\theta_\infty = \theta_{\min}$. The vector sequence $\{v^{(m)}\}$ is bounded. In particular
the ellipsoid given by $\theta(v) = \theta(v^{(0)})$ contains the ellipsoids given
by $\theta(v) = \theta(v^{(m)})$ and thus bounds the sequence. $\{v^{(m)}\}$ therefore has at
least one limit point and contains a subsequence which converges to
this limit. Let $v^\circ$ be a limit of $\{v^{(m)}\}$ and let $\{v^{(r)}\}$ be a subsequence
approaching $v^\circ$. For each $r$ identifying an element of the subsequence, let
$m(r)$ identify the same element in the original sequence.


From the continuity of $\theta$,

(5.5)  $\theta_\infty = \theta(v^\circ)$.


We shall show that to suppose $\theta(v^\circ) > \theta_{\min}$ involves a contradiction.
If the supposition is true then it is possible to reduce $\theta(v)$ by changing
a coordinate of $v^\circ$, i.e.,


(5.6)  $\exists\, k \ni P_k(v^\circ) \ne v^\circ$.

Let $\mathcal{K}$ be the set of all such $k$ and let

(5.7)  $\epsilon = \min_{k \in \mathcal{K}} \left[\theta(v^\circ) - \theta(P_k(v^\circ))\right]$.


Let $V$ be the set of all $v$ in the convex set bounded by the ellipsoid
$\theta(v) = \theta(v^{(0)})$. $\theta(v)$ is uniformly continuous over $V$, i.e.


(5.8)  $\exists\, \delta > 0 \ni \|v - v^*\| < \delta \Rightarrow |\theta(v) - \theta(v^*)| < \epsilon$


for all $v, v^* \in V$.

We proceed to show that our original vector sequence, $\{v^{(m)}\}$, contains
an element $P_k(v^{(m)})$ within $\delta$ of $P_k(v^\circ)$ for a $k \in \mathcal{K}$. It will then follow
from (5.8) that $|\theta(P_k(v^{(m)})) - \theta(P_k(v^\circ))| < \epsilon$. Since $\theta(P_k(v^\circ))$ is at least
$\epsilon$ below $\theta_\infty$, $\theta(P_k(v^{(m)}))$ must also be less than $\theta_\infty$. But since $\theta_\infty$ is the limit
of $\theta(v^{(m)})$, this is in contradiction to the definition of $\{v^{(m)}\}$.
From the continuity of the $P_k$ we know


(5.9)  $\exists\, \rho > 0 \ni \|v - v^*\| < \rho \Rightarrow \|P_k(v) - P_k(v^*)\| < \delta$  for all $k$.

From (5.1) we know that successive elements of $\{v^{(m)}\}$ can be made
arbitrarily close together by making $m$ sufficiently large. We also know
that if $r$ is sufficiently large, elements of the subsequence $\{v^{(r)}\}$ are
arbitrarily close to $v^\circ$. Specifically we may say


(5.10)  $\exists\, M \ni m > M \Rightarrow \|v^{(m)} - v^{(m-1)}\| < \dfrac{\rho}{K+1}$,
        $\exists\, R \ni r > R \Rightarrow \|v^{(r)} - v^\circ\| < \dfrac{\rho}{K+1}$.

Now consider an $r$ such that $r > R$, $m = m(r) > M$. The $K$ elements of
$\{v^{(m)}\}$ immediately following $v^{(r)}$ are all within $\rho$ of $v^\circ$. At least one of
them is obtained from the preceding by applying $P_k$ with $k \in \mathcal{K}$. Such
an element, $P_k(v^{(m+i)})$ with $i$ an integer between 0 and $K-1$, is
within $\delta$ of $P_k(v^\circ)$ and reduces $\theta(v)$ below $\theta_\infty$, i.e.,


(5.11)  $\|v^{(m+i)} - v^\circ\| < \rho$  for some $0 \le i \le K-1$

such that $v^{(m+i+1)} = P_k(v^{(m+i)})$ for $k \in \mathcal{K}$. From this and (5.9) we obtain

(5.12)  $\|P_k(v^{(m+i)}) - P_k(v^\circ)\| < \delta$  so that by (5.8)

(5.13)  $|\theta(P_k(v^{(m+i)})) - \theta(P_k(v^\circ))| < \epsilon$.  Then from (5.7)

(5.14)  $\theta(P_k(v^{(m+i)})) < \theta_\infty$.

REFERENCES 


[1] Anderson, R. L., "Recent advances in finding best operating conditions,"
Journal of the American Statistical Association, 48 (1953), 789-98.

[2] Box, G. E. P., and Wilson, K. B., "On the experimental attainment of optimum
conditions," Journal of the Royal Statistical Society, Series B, 13
(1951), 1-45.

[3] Chernoff, Herman and Divinsky, Nathan, "The computation of maximum-likelihood
estimates of linear structural relations," Ch. X of Studies in
Econometric Method, edited by Hood and Koopmans, New York, John
Wiley and Sons, 1953.

[5] Friedman, Milton and Savage, L. J., "Planning experiments seeking
maxima," Ch. 13 of Techniques of Statistical Analysis, edited by Eisenhart,
Hastay and Wallis, New York, McGraw-Hill Book Co., 1947.

[6] Hildreth, Clifford, and Reiter, Stanley, "On the choice of a crop rotation
plan," Ch. XI of Activity Analysis of Production and Allocation, edited by
T. C. Koopmans, New York, John Wiley and Sons, 1951.

[7] Hotelling, Harold, "Experimental determination of the maximum of a
function," Annals of Mathematical Statistics, 12 (1941), 20-45.

[8] Johnson, Paul R., "Alternative functions for analyzing a fertilizer-yield
relationship," Journal of Farm Economics, 35 (1953), 519-29.

[9] Koopmans, Tjalling C. and Hood, William C., "The estimation of simultaneous
linear economic relations," Ch. VI of Studies in Econometric Method,
edited by Hood and Koopmans, New York, John Wiley and Sons, 1953.

[10] Krantz, B. A., Fertilize Corn for Higher Yields, Bulletin 366 of the North
Carolina Agricultural Experiment Station, Raleigh, 1946.

[11] Kuhn, H. W. and Tucker, A. W., "Nonlinear programming," Proceedings of
the Second Berkeley Symposium on Mathematical Statistics and Probability,
edited by Jerzy Neyman, Berkeley, University of California Press, 1951.

[12] Prais, S. J., "Non-linear estimation of the Engel Curves," Review of Economic
Studies, 20 (1953), 87-104.

[13] Robbins, Herbert and Monro, Sutton, "A stochastic approximation
method," Annals of Mathematical Statistics, 22 (1951), 400-7.


APPROXIMATE DISTRIBUTION OF THE RANGE IN THE 
NEIGHBORHOOD OF LOW PERCENTAGE POINTS* 


Maurice H. Belz,† University of Melbourne
AND
Robert Hooke, Princeton University


I. GENERAL 

1. Introduction 
The investigations described below originated in a study of the upper
probabilities associated with the distribution of the range of
a small sample from a normal population. If the cumulative distribution
function for such a range is plotted on probability paper, as in
Hald [3], it is observed that the curves (for different sample sizes) tend
to become closely linear as the range increases. The same feature appears
to characterize the distribution of the range for small samples
from other continuous populations, for example, the $\chi^2$ distributions
for 2 and 4 degrees of freedom and the double negative exponential
distribution; see Fig. 1. Attempts to discover the reason for this property
have led to some general approximation procedures for estimating
upper-tail probabilities for the distribution of the range of small samples.

Fig. 1. Distribution of range of samples of 5 from various populations. (Axes: Probability, Range.)

* Prepared in connection with research sponsored, in part, by the U. S. Office of Naval Research.
† Research Associate, Princeton University, 1952.

If $x_1 < x_2 < \cdots < x_n$ represent the ordered members of a sample of
size $n$, the range is $x_n - x_1$. If this statistic is such that the probability of
its being exceeded is low, say of the order of .05 or .01, the effect of
$x_n$ on $x_1$ might well be supposed to be small, whatever the value of $n$.
This suggests, first, the investigation of an approximation based on the
notion of complete independence of $x_1$ and $x_n$. Secondly, one may attach
to this the assumption that $x_1$ and $x_n$ are normally distributed, a
result that is known to be approximately true, for small values of $n$,
when the parent population is normal [6]. Results of the application of
these procedures to small samples from various populations are reported
below.


2. Notation and Summary of Results 
The following notation is employed throughout. For $a > 0$,


$P_1(a)$ = probability that the range exceeds $a$;

$P_2(a)$ = approximation to the probability of the same event, based on
the supposition that $x_1$ and $x_n$ are independent, with means
and variances characteristic of their actual distributions;

$P_3(a)$ = approximation to the probability of the same event, based now
on the supposition that $x_1$ and $x_n$ are independently and normally
distributed, with means and variances characteristic
of their actual distributions;

$R(a) = P_1(a)/P_2(a)$;

$R = \lim_{P_1(a) \to 0} R(a)$.


It is shown that, in general, $R$ is not equal to 1, the value that might
intuitively have been expected, so that the indicated approximations
to $P_1(a)$ are $RP_2(a)$ and $RP_3(a)$. For any distribution defined over a
finite interval, $R = (n-1)/n$. For distributions having just one infinite
tail, $R$ lies between $(n-1)/n$ and 1 and can, in fact, be as large as 1
or as small as $(n-1)/n$. These results appear also to be true in the case
of distributions with two infinite tails, though no proofs have yet been
obtained.

Numerical applications of the approximation procedures have been
made to particular distributions, mostly with $n = 5$ and with emphasis
on those values of $a$ which place $P_1(a)$ between .05 and .01, the interval
of most interest in practice. The results are summarized in Table I.




TABLE I 
TAIL PROBABILITIES FOR THE DISTRIBUTION OF THE RANGE 
(n=5 unless otherwise noted) 
Probabilities 
Distribution a Approximate | Ка R | RP:(a) RP:(a) 
Ра) | Paa) Рқа) 
Rectangular .92 | .0544 | .0646 .1017 | .842 8 0517 .0814 
Jæ) =i for 0 Sz $1 ‚954 | .0193 0234 .0746 825 .8 0187  .0597 
1. 99 Я Я 0 
Кичи ы мю | o аш шаа аш ан 
2—21071 5252] 1.8 10018 | :0016 .0077 807 .8 0013 .0002 
Beta, 2,2 7 1403 +1565 8 .1252 
fa) ial =2) 178 | .0497 10720 18 10576 
for 03231 8 0355 10577 E 10461 
185 | 0124 10817 t 10254 
24.f. 4.36 | .050 4 i 4 
жеги ж € 3E NE E 
[D TTA 6 .0461 | .0492 .0318 | .937 .9262 | .0456  .0295 
йа. n-1 
Os2< 
CN when mu 
Hi 50 @ .0571 2 3 
TOTAL say 100 10008 | 0з % 1 “0308 
for 0 Sz = (п 3) 
в ома | .0070 | .972 1 | .0570 
ЕП 1+), 0 à к 
Гаа) м ee 1 0233 | 0238 1980 1 10238 
| 2. | .oo10 | .096 .198 | . 3 
йе, HP ШЕЕ ЕЕ 
4 10914 | .0222  :0080 | ‘965 1 0222 10080 
аура: 
араас 
Double Negative 6 0589 E 
8.2 1008 | бо 1000 Ba 8 з 10000 


TET. e 


Hyperbolic 


mati 05 таз, 
(A unrestricted) 
Biapderd Normal $5 
olg. Me 4.00 
for =» «zc HEU 
4.29 
a-s] 461 
4.99 
4.47 
(n=10){| 4:70 | | 
e || 5.16 | . 0000 | . A 
1 з ETARE 
Pp 100256 |* `00271 1045 6 
this 20309198 marks indicate that, although is akon to b 0.8, this has not been established in 


3. Conclusions


Table I shows that $RP_3(a)$ provides a fair to good approximation to
$P_1(a)$ if $P_1(a)$ is near .05 or if the parent distribution is normal. When
the parent population is unknown it therefore seems reasonable to suppose
that $RP_3(a)$ will be a good approximation to $P_1(a)$ provided that
the distribution of the parent is not too far from normal. Even when this
condition is not fulfilled, the variety of examples in Table I indicates
that the approximation is likely to be good if $P_1(a)$ is near to .05.

The results of examples involving two-tailed distributions are consistent
with the suggestion that $R$ lies between $(n-1)/n$ and 1. In all
the cases examined the ratio $R(a)$ was found to lie between this same
pair of values. Failure to know the exact value of $R$ is not very serious,
for even when $n$ is as small as 5 the range of $R$ is only from .8 to 1,
and whatever the value one takes for $R$ in this range no error of any
consequence is likely to be committed.

It is shown in Section 7 that when the parent population is unknown
and cannot be assumed to be normal, the use of the approximation
$RP_3(a)$ is likely to be considerably better than that obtained by the
direct method of examining observed ranges, when the amount of data
is small. This is supported by some results obtained from drawings from
various populations.

The closeness of the approximations obtained for distributions defined
over a finite interval as well as for the normal distribution suggests
that the procedure might be usefully extended to contrasts
other than the range. Tables of percentage points to cover quite simple
contrasts are not now available, even for a normal parent, and some
approximation procedure must accordingly be adopted. An investigation
of the contrast $x_n - \tfrac{1}{2}(x_1 + x_2)$ has been carried out along the above
lines for samples of 5 from four different populations, and in Section
8 it is shown that the approximation $RP_3(a)$ is remarkably close to the
true probability $P_1(a)$ when this is of the order .05 to .01. The analytical
results are again supported by means of random drawings.


II. THE SUPPOSITION OF INDEPENDENCE: THEORETICAL RESULTS
4. Expressions for $P_1(a)$, $P_2(a)$, $R(a)$


Let $f(x)$, $F(x)$ be the density function and distribution function,
respectively, and let $(\alpha, \beta)$ be the interval over which $x$ is defined.
Then we have the following results where, for simplicity, $u$ is written
in place of $x_1$, and $v$ in place of $x_n$:


(1)  $P_1(a) = \Pr(v - u > a)$,  $(a > 0)$
            $= \iint_{\omega} n(n-1) f(u) \{F(v) - F(u)\}^{n-2} f(v)\, du\, dv$,

(2)  $P_2(a) = \iint_{\omega} n^2 f(u) \{1 - F(u)\}^{n-1} \{F(v)\}^{n-1} f(v)\, du\, dv$,

(3)  $R(a) = \dfrac{P_1(a)}{P_2(a)} = \dfrac{n-1}{n} \cdot \dfrac{\iint_{\omega} \{F(v) - F(u)\}^{n-2} f(u) f(v)\, du\, dv}{\iint_{\omega} \{1 - F(u)\}^{n-1} \{F(v)\}^{n-1} f(u) f(v)\, du\, dv}$,


where $\omega$ is the relevant portion of the half-plane for which $v > u + a$.

In evaluating the integrals it is convenient to integrate first with
respect to $v$, a distribution-free procedure, the terminals for the integral
being $u + a$ and $\beta$. The integration with respect to $u$ is then from $\alpha$ to
$\beta - a$. We find the following expressions,


(4)  $P_1(a) = 1 - \{1 - F(\beta - a)\}^n - n \int_{\alpha}^{\beta - a} f(u) \{F(u + a) - F(u)\}^{n-1}\, du$,

(5)  $P_2(a) = 1 - \{1 - F(\beta - a)\}^n - n \int_{\alpha}^{\beta - a} f(u) \{1 - F(u)\}^{n-1} F^n(u + a)\, du$,


with the aid of which most of the numerical values in Table I have been
computed.
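
As a rough numerical check on formulas (4) and (5), the sketch below evaluates them for the rectangular distribution on $(0, 1)$ with $n = 5$ and $a = .92$. The helper names `simpson` and `range_tail_probs` are hypothetical, and Simpson's rule is simply one convenient quadrature; it is not claimed to be the method used by the authors.

```python
def simpson(f, lo, hi, steps=2000):
    """Composite Simpson's rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * f(lo + j * h)
    return total * h / 3.0


def range_tail_probs(a, n, f, F, alpha, beta):
    """Return (P1, P2) from (4) and (5) for density f and c.d.f. F on (alpha, beta)."""
    head = 1.0 - (1.0 - F(beta - a)) ** n
    p1 = head - n * simpson(
        lambda u: f(u) * (F(u + a) - F(u)) ** (n - 1), alpha, beta - a)
    p2 = head - n * simpson(
        lambda u: f(u) * (1.0 - F(u)) ** (n - 1) * F(u + a) ** n, alpha, beta - a)
    return p1, p2


# Rectangular distribution f(x) = 1 on 0 <= x <= 1.
def unif_f(x):
    return 1.0


def unif_F(x):
    return min(max(x, 0.0), 1.0)


p1, p2 = range_tail_probs(0.92, 5, unif_f, unif_F, 0.0, 1.0)
print(p1, p2, p1 / p2)
```

The computed values lie close to the Table I entries for the rectangular case at $a = .92$ ($P_1 = .0544$, $P_2 = .0646$, $R(a) = .842$).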


5. The Limit R


The assumption involved in computing $P_2(a)$, that $x_1$ and $x_n$ are
independent, is equivalent to assuming that $x_1$ is contrasted with the
greatest order statistic, $y_n$ say, of an independently drawn sample of
the same size. When $a$ is extreme, in the sense that the probability of its
being exceeded is small, it might be expected that the expressions
$P_1(a)$, $P_2(a)$ will have an interesting relationship. We are led to consider
the limiting value of the ratio $R(a)$ as $a$ increases to its greatest
value or indefinitely, as the case may be, that is, as $P_1(a)$ and $P_2(a)$ both
converge to zero.

Consider, at first, the case where $\alpha$, $\beta$ are both finite. The region
$\omega$ is now the triangle bounded by the lines $u = \alpha$, $v = \beta$, $v = u + a$. On
applying an appropriate mean-value theorem to each of the integrals
appearing in (3), we obtain the form


(6)  $R(a) = \dfrac{n-1}{n} \cdot \dfrac{\{F(v^*) - F(u^*)\}^{n-2}}{\{1 - F(u^{**})\}^{n-1} \{F(v^{**})\}^{n-1}}$


where $(u^*, v^*)$, $(u^{**}, v^{**})$ are points in $\omega$. If $a$ is allowed to approach
its greatest value $\beta - \alpha$, the starred points both converge to the point
$(\alpha, \beta)$, so that each of the expressions in braces in (6) converges to 1.
It follows that the limit of $R(a)$ exists and is given by


(7)  $R = \dfrac{n-1}{n}$.

This result, though contradicting the intuitive notion referred to
above and which suggests the value 1 for $R$, is nevertheless distribution-free
and independent of $\alpha$ and $\beta$. One might, therefore, be tempted
to suppose that (7) would extend immediately to distributions for which
one or both of $\alpha$, $\beta$ were not finite, on the grounds that truncation at
a sufficiently remote point could have no appreciable effect on the distribution.
Such a generalization, however, is false, as shown by the
results in Table I.
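
The finite-interval limit (7) can be illustrated numerically. The sketch below, which assumes the rectangular distribution on $(0, 1)$ and uses hypothetical helper names, evaluates $R(a) = P_1(a)/P_2(a)$ from (4) and (5) as $a$ approaches its greatest value 1. For this distribution $P_1(a)$ has the closed form $1 - a^n - n a^{n-1}(1 - a)$.

```python
def simpson(f, lo, hi, steps=2000):
    """Composite Simpson's rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * f(lo + j * h)
    return total * h / 3.0


def ratio_rectangular(a, n):
    """R(a) = P1(a) / P2(a) for the rectangular distribution on (0, 1)."""
    p1 = 1.0 - a ** n - n * a ** (n - 1) * (1.0 - a)
    integral = simpson(lambda u: (1.0 - u) ** (n - 1) * (u + a) ** n, 0.0, 1.0 - a)
    p2 = 1.0 - a ** n - n * integral
    return p1 / p2


n = 5
ratios = {a: ratio_rectangular(a, n) for a in (0.92, 0.99, 0.999)}
for a, r in ratios.items():
    print(a, r)
print("limit (n - 1) / n =", (n - 1) / n)
```

For $n = 5$ the ratio decreases toward .8 as $a$ increases, which is the behaviour described by (7).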


We now sketch the outlines of a proof of the theorem that for one-tailed
distributions the limit $R$ exists and satisfies the double inequality


(8)  $\dfrac{n-1}{n} \le R \le 1$.


We may, without loss of generality, take the interval of definition as
$0 \le x < \infty$. On integrating (1) and (2) with respect to $v$ and expanding
the subsequent integrands by the binomial theorem, $R(a)$ can be expressed
in the form


(9)  $R(a) = \dfrac{\sum_{r=1}^{n-1} (-1)^{r+1} \binom{n-1}{r} \phi_{n-1-r,\,r}(a)}{\sum_{r=1}^{n} (-1)^{r+1} \binom{n}{r} \phi_{n-1,\,r}(a)}$


where


(10)  $\phi_{p,q}(a) = \int_0^\infty G^p(u)\, G^q(u + a)\, f(u)\, du$

and

(11)  $G(u) = 1 - F(u)$.

In the successive terms of the sums in (9), with $r$ increasing, the
powers of $G(u + a)$ that occur increase while those of $G(u)$ do not. By
considering the functions

f e ostudu 
(12) І,.9.9 = - 2 › 
f G*(u)G*(u + a)f(u)du { 
0 4 
[| Gr(u)G1(u + a)f(u)du 
(13) Jua» = £ Д 
f G^"(u)G«(u + a)f(u)du 
0 
where 6 may have the value 0 or 1, it is readily shown that, uniformly 
in аїог0<а<«, ) 
(14) lim Т,» = 0 ў 
апа that 
(15) “Лео < G(2)/(G()]*. 


И 6=1 we may choose z such that G(z) = /G(a), and we thus have 


Í "eer + a)f(u)du 


лыт с e Jua dE Tot 
z Gr (u)G*(u + a)f(u)du ^ 1+ 144,460.29 


«1+ Loan 
< {G(a)} 7 
r3, | 


where e has the value 1 or j. On letting a (and Зу tend to infinity, we 


reach the result


(17)  $R = \dfrac{n-1}{n} \lim_{a \to \infty} \dfrac{\phi_{n-2,1}(a)}{\phi_{n-1,1}(a)}$.


Since $\phi_{n-1,1}(a) < \phi_{n-2,1}(a)$, we establish the first half of the double
inequality, namely


(18)  $\dfrac{n-1}{n} \le R$.


Finally, on integrating $\phi_{n-2,1}(a)$ and $\phi_{n-1,1}(a)$ by parts, we obtain the
relations


(19)  $\dfrac{n-1}{n} \cdot \dfrac{\phi_{n-2,1}(a)}{\phi_{n-1,1}(a)} = \dfrac{G(a) - \int_0^\infty G^{n-1}(u) f(u + a)\, du}{G(a) - \int_0^\infty G^{n}(u) f(u + a)\, du} < 1$,


and the second of the desired inequalities, namely,

(20)  $R \le 1$,


follows at once.

The result (8) is thus established.

Attempts to prove the same result for two-tailed distributions have
not been successful, although the numerical results obtained suggest
that it is true in general.


III. THE SUPPOSITION OF NORMALITY
6. The Approximation $P_3(a)$


If we suppose that $x_1$ and $x_n$ are not only independent but also
normally distributed, their difference $x_n - x_1$, that is, the range, is also
normally distributed and, knowing the parameters of this distribution,
the probability that the range will exceed any given value $a$ can be
obtained by entering the standard normal tables. This approximation
to $P_1(a)$ is denoted by $P_3(a)$. The required parameters are found from
the mean and variance (when they exist) of the distributions of each
of the end order statistics. When the density function $f(x)$ is known,
their determination is straightforward. For some of the distributions
considered in Table I use has been made of the tabulated results of
Hastings, Mosteller, Tukey and Winsor [5] and of Godwin [2]; for
others direct calculations were made.

If $\mu_1$, $\mu_n$ are the expected values and $\sigma_1^2$, $\sigma_n^2$ the variances of $x_1$,
$x_n$ respectively, the combined assumption of independence and normality
leads to the standard normal variate


(21)  $\xi = \dfrac{(x_n - x_1) - (\mu_n - \mu_1)}{\sqrt{\sigma_1^2 + \sigma_n^2}}$.

The value of $P_3(a)$ is then given by

(22)  $P_3(a) = \Pr\left\{\xi > \dfrac{a - (\mu_n - \mu_1)}{\sqrt{\sigma_1^2 + \sigma_n^2}}\right\}$.


The results obtained by the application of this formula are set out
in Table I.
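
Formula (22) is easy to evaluate directly. The sketch below is illustrative only: the function names are hypothetical, the normal-parent constants are those quoted in Section 7 for $n = 5$, and the rectangular-parent moments are the standard order-statistic values for the uniform distribution ($\mu_1 = 1/6$, $\mu_5 = 5/6$, common variance $5/252$).

```python
import math


def upper_normal_tail(z):
    """Pr(xi > z) for a standard normal variate xi."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def p3(a, mu_1, mu_n, var_1, var_n):
    """The approximation P3(a) of formula (22)."""
    z = (a - (mu_n - mu_1)) / math.sqrt(var_1 + var_n)
    return upper_normal_tail(z)


# Normal parent, n = 5 (means and variances of the end order statistics
# as quoted in Section 7).
p3_normal = p3(3.86, -1.16296, 1.16296, 0.447535, 0.447535)

# Rectangular parent on (0, 1), n = 5 (standard uniform order-statistic moments).
p3_rect = p3(0.92, 1.0 / 6.0, 5.0 / 6.0, 5.0 / 252.0, 5.0 / 252.0)

print(p3_normal, 0.8 * p3_normal)
print(p3_rect, 0.8 * p3_rect)
```

The results agree closely with the values $P_3(a) = .0525$ (normal, $a = 3.86$) and $P_3(a) = .1017$ (rectangular, $a = .92$) reported in the paper, and multiplying by $R = .8$ gives the corresponding $RP_3(a)$.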


7. Application of the Approximation $RP_3(a)$


When samples are drawn from a given population, it is necessary to
have some knowledge of the percentage points of the distribution
of the range in order to decide whether the range obtained in any particular
case is significantly large. For this purpose we may have available
data from $k$ independent samples each of size $n$. If the parent
population is normal and an unbiased estimate of its variance is found
from these samples, the required percentage points of the distribution
can be found by referring to published tables of the "Studentized"
range [4]. Since it is not our purpose to try to improve on this method,
any application of the approximations discussed in this paper must be
to (a) cases involving non-normal distributions, or (b) problems concerned
with contrasts for which no tables are available.


A survey of Table I shows that the approximation $RP_3(a)$ is likely
to prove most useful for non-normal populations in the vicinity of the
5 per cent point. Restricting ourselves to those entries in the table for
which $P_1(a)$ is about .05, we note first that the error made in using
$RP_3(a)$ in place of $P_1(a)$ is around 16% of $P_1(a)$ in the case of normal
samples (15.7% for samples of size 5, 16.5% for samples of 8 and 16.8%
for samples of 10), while for the others it ranges from 3.0% (symmetrical
triangular) to 65.4% (one-tailed hyperbolic). It is of interest to compare
these with errors of estimation that may be met in practical situations.

The direct way of using the $k$ samples of $n$ in the non-normal case
would be to compute the corresponding ranges and determine the
quantile $w_{.95}$, say, for this sample of $k$ values. If this statistic is to be
used for estimating the corresponding population quantile $Q_{.95}$, we
shall need to investigate an appropriate confidence interval for $Q_{.95}$.
The magnitude of this interval is indicated by the corresponding confidence
interval for the parameter $\pi$ in the binomial distribution
$\binom{k}{h} \pi^h (1 - \pi)^{k-h}$, where $h$ is the number of "successes" observed in the
sample of $k$, here the number of sample ranges greater than or equal
to $w_{.95}$. If $k = 180$ and $h = 9$, the 95 per cent confidence limits for $\pi$
are .095 and .024, representing errors of 90% and 52%, respectively,
if $\pi = .05$; if $k = 100$ and $h = 5$, the corresponding limits are .112 and .017,
representing errors of 124% and 66%; if $k = 60$ and $h = 3$, the limits are
.140 and .010, representing errors of 180% and 80%; and if $k = 20$ and
$h = 1$, the limits are .249 and .001, representing errors of 398% and 98%.
Hence the errors determined above for the approximation $RP_3(a)$ are
exceeded by those obtained by using the direct method, based on the
95 per cent confidence interval, when 100 samples of 5 are available,
and are extremely likely to be exceeded when even as many as 180
samples of 5 are available.
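
The confidence limits for the binomial parameter can be approximated with a short computation. The sketch below assumes that "exact" equal-tailed limits (the Clopper-Pearson construction) are intended; the paper does not say how its limits were obtained, so this is an assumption, and the function names are hypothetical. The limits are found by bisection on the binomial tail sums.

```python
from math import comb


def binom_cdf(h, k, p):
    """Pr(X <= h) for X binomial with k trials and success probability p."""
    return sum(comb(k, j) * p ** j * (1.0 - p) ** (k - j) for j in range(h + 1))


def bisect_root(g, lo, hi, iters=200):
    """Root of g on [lo, hi], assuming g changes sign exactly once there."""
    g_lo = g(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def exact_limits(h, k, level=0.95):
    """Equal-tailed exact (Clopper-Pearson) limits for the binomial parameter."""
    tail = (1.0 - level) / 2.0
    if h == 0:
        lower = 0.0
    else:
        # Lower limit: Pr(X >= h | p) = tail.
        lower = bisect_root(lambda p: (1.0 - binom_cdf(h - 1, k, p)) - tail, 0.0, 1.0)
    if h == k:
        upper = 1.0
    else:
        # Upper limit: Pr(X <= h | p) = tail.
        upper = bisect_root(lambda p: binom_cdf(h, k, p) - tail, 0.0, 1.0)
    return lower, upper


for k, h in ((180, 9), (100, 5), (60, 3), (20, 1)):
    print(k, h, exact_limits(h, k))
```

For $k = 20$, $h = 1$ this construction gives limits of about .001 and .249, matching the figures quoted in the text; the other cases come out close to, though not always identical with, the quoted limits.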

It must be recalled, however, that the errors so far associated with
$RP_3(a)$ have been computed on the basis of known means and variances
for the distributions of $x_1$ and $x_n$. When, as will happen in practice, the
density function $f(x)$ is unknown, the relevant parameters must be estimated
from the observations themselves, and the use of these estimates
in place of the true values will modify the error in approximating to
$P_1(a)$.

To investigate this effect, let us assume that the parent population
is symmetrical and that we have available 20 independent samples of
5 observations each, so that $k = 20$, $n = 5$. Denote the estimates of the
means and variances of $x_1$, $x_5$ by $\bar{x}_1$, $\bar{x}_5$ and $s_1^2$, $s_5^2$, respectively. No two
of these estimates are strictly independent, but the dependence is
probably not strong and, as we are interested here chiefly in rough
comparisons, we shall assume independence. On the basis of the approximate
normality of distribution of the order statistics $x_1$, $x_5$, referred
to in Section 1, the standard error of $\bar{x}_5 - \bar{x}_1$ is $\sigma/\sqrt{10}$ while that
of both $s_1^2$ and $s_5^2$ is $\sigma^2/\sqrt{9.5}$, $\sigma^2$ being the common variance of $x_1$
and $x_5$. On replacing $\mu_5 - \mu_1$ by $\bar{x}_5 - \bar{x}_1$ and $\sigma_1^2 + \sigma_5^2$ by $s_1^2 + s_5^2$, we have,
in place of (22), the approximation, $P_3^*(a)$ say, given by

(23)  $P_3^*(a) = \Pr\left\{\xi > \dfrac{a - (\bar{x}_5 - \bar{x}_1)}{\sqrt{s_1^2 + s_5^2}}\right\}$

where $\xi$ is $N(0, 1)$. The working approximation to $P_1(a)$ is now $RP_3^*(a)$
where, for $n = 5$, $R = .8$ for all distributions defined over finite intervals
and may presumably be taken as .8 for all two-tailed distributions that
do not depart widely from normality.

In the special case of a normally distributed parent population we
have, quoting from [2],


$\mu_5 = -\mu_1 = 1.16296$,  $\sigma_1^2 = \sigma_5^2 = .447535$,


giving, for $a = 3.86$, $P_1(a) = .0498$ and $P_3(a) = .0525$. If $\bar{x}_5 - \bar{x}_1$ overestimates
$\mu_5 - \mu_1$ by one standard error, say, and simultaneously $\sigma_1^2$,
$\sigma_5^2$ are each overestimated by one standard error, we find $P_3^*(a) = .1122$
and $.8P_3^*(a) = .0898$. We have taken one standard error in each case
rather than the customary two, which correspond roughly to the 95
per cent confidence limits, but we have supposed that the errors accumulate
whereas they will often tend to cancel. Even though the
result .0898 overestimates $P_1(a)$ by about 80%, it compares quite
favourably with the upper 95 per cent confidence limit obtained by
the direct method. A similar procedure on the side of underestimation
gives $.8P_3^*(a) = .0099$, as compared with .001, the lower 95 per cent
confidence limit obtained by the direct method.
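
The one-standard-error calculation for the normal parent can be reproduced as follows. This is a sketch with hypothetical function names; it uses the constants quoted above, the standard errors $\sigma/\sqrt{10}$ and $\sigma^2/\sqrt{9.5}$ from the text, and formula (23).

```python
import math


def upper_normal_tail(z):
    """Pr(xi > z) for a standard normal variate xi."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def p3_star(a, mean_diff, var_sum):
    """Formula (23) with estimated mean difference and estimated variance sum."""
    return upper_normal_tail((a - mean_diff) / math.sqrt(var_sum))


MU_5 = 1.16296            # mean of the largest of 5 normal observations
VAR = 0.447535            # common variance of x_1 and x_5
A = 3.86

se_mean_diff = math.sqrt(VAR / 10.0)   # sigma / sqrt(10)
se_var = VAR / math.sqrt(9.5)          # sigma^2 / sqrt(9.5)


def shifted_p3_star(sign):
    """P3*(a) when every estimate is off by one standard error in the same direction."""
    mean_diff = 2.0 * MU_5 + sign * se_mean_diff
    var_sum = 2.0 * (VAR + sign * se_var)
    return p3_star(A, mean_diff, var_sum)


over = shifted_p3_star(+1)
under = shifted_p3_star(-1)
print(over, 0.8 * over)
print(under, 0.8 * under)
```

The overestimation case gives values close to $P_3^*(a) = .1122$ and $.8P_3^*(a) = .0898$, and the underestimation case gives $.8P_3^*(a)$ close to .0099.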

The above conjectures have been tested by means of random drawings
of samples of 5 from (i) the standard normal distribution, using
the tables of random deviates given in [7]; (ii) the rectangular distribution
$f(x) = 1$ for $0 \le x \le 1$, using the five-figure tables of random numbers
given in [1] and [9]; and (iii) the beta distribution for the values
2, 2 of the parameters, namely $f(x) = 6x(1 - x)$ for $0 \le x \le 1$, using five-figure
tables of random numbers in conjunction with the $t$-tables for
4 degrees of freedom given in [4] and subsequently transforming the
variate values found. The results are collected in Table II, corresponding
to the values 20, 60, 100 and 180 of $k$ and to the value .8 of $R$.
(In each case, the 180 samples are the compound of all the preceding
ones.) The values of the relevant parameters are also included, corresponding
to the value $\infty$ of $k$.

It is observed that for the normal and beta distributions the
approximation RP3*(a) is extremely good for all values of k. For the
rectangular distribution it is only slightly less satisfactory. In only three
situations is the difference between μ_5 − μ_1 and x̄_5 − x̄_1 of the same sign
as that between σ_1² + σ_5² and s_1² + s_5². The greatest percentage error in
P3*(a) relative to P3(a) on the side of overestimation is 46.7 (rectangular
for k = 20) and on the side of underestimation 53.8 (beta for k = 20).
In no case have errors as large as those conjectured above for
the normal case (114% on the side of overestimation, 77% on that of
underestimation) been approached.

We conclude that the use of the approximation [(n − 1)/n]P3*(a) is, for
samples of small size, preferable to the direct method when the number
of samples available lies between 20 and 180.
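
The working approximation for the range can be checked against the k = ∞ rows of Table II. The following is a minimal sketch, not the authors' computing procedure. It assumes that P3(a) for the range has the form Pr{z > [a − (μ_5 − μ_1)]/√(σ_1² + σ_5²)}, the analogue of formula (28); that form is inferred here because it reproduces the tabulated values .0814, .0576 and .0420. The function and variable names are hypothetical.

```python
import math

def upper_tail(z):
    """Pr(Z > z) for a standard normal variate Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def range_working_approx(a, mean_low, mean_high, var_low, var_high, n):
    """[(n-1)/n] * P3(a) for the range, treating the two extreme
    order statistics as independent normal variates (assumed form)."""
    z = (a - (mean_high - mean_low)) / math.sqrt(var_low + var_high)
    return (n - 1) / n * upper_tail(z)

# Population moments of the extremes in samples of 5 (k = infinity rows of Table II).
table_ii = {
    "rectangular": dict(a=0.92, mean_low=0.1667, mean_high=0.8333,
                        var_low=0.01984, var_high=0.01984),
    "beta": dict(a=0.78, mean_low=0.2393, mean_high=0.7607,
                 var_low=0.01565, var_high=0.01565),
    "normal": dict(a=3.86, mean_low=-1.163, mean_high=1.163,
                   var_low=0.4475, var_high=0.4475),
}

results = {name: range_working_approx(n=5, **row) for name, row in table_ii.items()}
for name, value in results.items():
    print(f"{name:12s} {value:.4f}")
```

Replacing the population moments by the sample means and variances from any other row of the table gives the corresponding .8P3*(a).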

TABLE II
RESULTS OF RANDOM DRAWINGS: CONTRAST x_5 − x_1

                                                                       Standardized           Percentage error
                                                                       overestimation of      in P3*(a) relative
Distribution        k      x̄_1      x̄_5     s_1²     s_5²    .8P3*(a)   μ_5−μ_1   σ_1²+σ_5²    to P3(a)

Rectangular         20     .1138    .8416   .01861   .01555   .1193     − .88     − .86        +46.7
a = .92             60     .1420    .8354   .01384   .01710   .0692     +1.04     −2.38        −14.9
P1(a) = .0544      100     .1516    .8407   .01773   .02060   .0953     +1.12     − .48        +17.1
.8P3(a) = .0814    180     .1442    .8390   .01648   .01869   .0919     +1.89     −2.14        +13.0
                     ∞     .1667    .8333   .01984   .01984

Beta                20     .2272    .7686   .00543   .01150   .0266     + .51     −2.83        −53.8
a = .78             60     .2235    .7698   .01104   .01603   .0622     +1.09     −1.47        + 8.1
P1(a) = .0497      100     .2417    .7434   .02077   .01422   .0547     −1.41     +1.66        − 5.0
.8P3(a) = .0576    180     .2340    .7550   .01578   .01454   .0547     − .03     − .59        − 5.0
                     ∞     .2393    .7607   .01565   .01565

Normal              20    −1.177    1.108   .4228    .7307    .0570     − .20     +1.85        +35.8
a = 3.86            60    −1.169    1.203   .5054    .3849    .0459     + .38     − .06        + 9.3
P1(a) = .0498      100    −1.023    1.115   .5046    .3822    .0270     −1.98     − .13        −35.8
.8P3(a) = .0420    180    −1.089    1.144   .4959    .4176    .0354     −1.31     + .41        −15.6
                     ∞    −1.163    1.163   .4475    .4475


IV. EXTENSIONS TO OTHER CONTRASTS

8. The Contrast x_n − ½(x_1 + x_2)

The general procedure outlined above can be applied to contrasts
other than the range. As a simple extension we shall consider the
contrast x_n − ½(x_1 + x_2). The expression for P1(a) is obtained from the joint
probability density function of the three order statistics, upon
integration over the region for which the inequalities x_2 ≥ x_1,
x_n ≥ a + ½(x_1 + x_2), x_n ≥ x_2 are simultaneously satisfied. The integrand is here

(24)    $n(n-1)(n-2)\,f(x_1)\,f(x_2)\,\{F(x_n)-F(x_2)\}^{n-3}\,f(x_n).$

Assuming that f(x) is defined over the interval (−∞, ∞) and that the
order of integration is x_n, x_2, x_1 in turn, the integral is most easily
evaluated as the sum of two triple integrals for which the terminals
are, respectively in order,

    (½(x_1 + x_2) + a, ∞), (x_1, x_1 + 2a), (−∞, ∞)

and (x_2, ∞), (x_1 + 2a, ∞), (−∞, ∞). When the density function is
defined over a finite interval (α, β), the integral is evaluated as a triple
integral for which the terminals are, in order,

    (½(x_1 + x_2) + a, β), (x_1, 2β − 2a − x_1), (α, β − a).


In forming the approximation P2(a) we assume that x_1 and x_2 are
correlated but are both independent of x_n, leading to the integrand

(25)    $n^{2}(n-1)\,f(x_1)\,f(x_2)\,\{1-F(x_2)\}^{n-2}\,F^{\,n-1}(x_n)\,f(x_n).$

When f(x) is defined over the finite interval (α, β) it is readily shown
that the ratio R(a) = P1(a)/P2(a) converges to the limit

(26)    $R = (n-2)/n,$

as a converges to β − α. No attempt has been made to investigate bounds
for R when f(x) is defined under more general conditions, although it
is conjectured that the above limit is applicable to the normal case
and that, in general, the upper bound is 1.
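
The limit (26) can be made plausible by an informal argument, offered here as a heuristic sketch rather than as the authors' proof. As a approaches β − α the region of integration shrinks toward x_1, x_2 → α and x_n → β, so that F(x_n) − F(x_2), 1 − F(x_2) and F(x_n) all tend to 1. The ratio of the exact integrand to the approximate integrand then tends to the ratio of their leading constants:

```latex
\frac{n(n-1)(n-2)\,f(x_1)f(x_2)\{F(x_n)-F(x_2)\}^{n-3}f(x_n)}
     {n^{2}(n-1)\,f(x_1)f(x_2)\{1-F(x_2)\}^{n-2}F^{\,n-1}(x_n)f(x_n)}
\;\longrightarrow\;
\frac{n(n-1)(n-2)}{n^{2}(n-1)} \;=\; \frac{n-2}{n}.
```

For samples of n = 5 this gives R = .6, the coefficient applied to P3(a) and P3*(a) in the tables of this section.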

In investigating the approximation P3(a), which is based on the
assumption that the variates ½(x_1 + x_2) and x_n are normally and
independently distributed, we require the means and variances of x_1, x_2, x_n
as well as the covariance between x_1 and x_2. If these parameters are, in
turn, μ_1, μ_2, μ_n and σ_1², σ_2², σ_n², σ_12, the present assumption leads to the
standard normal variate

(27)    $z = \dfrac{[x_n - \tfrac12(x_1+x_2)] - [\mu_n - \tfrac12(\mu_1+\mu_2)]}{\sqrt{\tfrac14(\sigma_1^2+\sigma_2^2+2\sigma_{12})+\sigma_n^2}}$

so that, for given a > 0,

(28)    $P_3(a) = \Pr\left\{ z > \dfrac{a - [\mu_n - \tfrac12(\mu_1+\mu_2)]}{\sqrt{\tfrac14(\sigma_1^2+\sigma_2^2+2\sigma_{12})+\sigma_n^2}} \right\}.$

The indicated approximations to P1(a) are thus [(n − 2)/n]P2(a) and
[(n − 2)/n]P3(a).

If the density function f(x) is unknown, use may be made of the
last-written formula in approximating to P1(a), where the parameters
appearing on the right-hand side of the expression (28) for P3(a) are to be
estimated. As before, we may have available k samples of n for this
purpose. When the appropriate estimates are inserted, the corresponding
probability may be denoted by P3*(a), and for distributions that
are defined over a finite interval the working approximation to P1(a)
then becomes [(n − 2)/n]P3*(a). The same rule may be presumed for
distributions that are normal or nearly normal.
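
The passage from (28) to the working approximation can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' computing routine: the function names are hypothetical, the sample variances and covariance use the divisor k − 1 (the paper does not say which divisor was used), and the population moments are taken from the k = ∞ row for the normal distribution in Table IV.

```python
import math
import random

def upper_tail(z):
    """Pr(Z > z) for a standard normal variate Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def working_approx(a, n, m1, m2, mn, v1, v2, c12, vn):
    """[(n-2)/n] * P3(a) for the contrast x_n - (x_1 + x_2)/2, following (28)."""
    centre = mn - 0.5 * (m1 + m2)
    spread = math.sqrt(0.25 * (v1 + v2 + 2.0 * c12) + vn)
    return (n - 2) / n * upper_tail((a - centre) / spread)

def estimate_moments(samples):
    """Means, variances and covariance of x_1, x_2, x_n over k samples."""
    k = len(samples)
    ordered = [sorted(sample) for sample in samples]
    x1 = [s[0] for s in ordered]
    x2 = [s[1] for s in ordered]
    xn = [s[-1] for s in ordered]

    def mean(values):
        return sum(values) / k

    def cov(u, v):
        mu, mv = mean(u), mean(v)
        return sum((p - mu) * (q - mv) for p, q in zip(u, v)) / (k - 1)

    return dict(m1=mean(x1), m2=mean(x2), mn=mean(xn),
                v1=cov(x1, x1), v2=cov(x2, x2), c12=cov(x1, x2), vn=cov(xn, xn))

# Known parameters (standard normal, n = 5): gives .6 * P3(a).
normal_pop = dict(m1=-1.163, m2=-0.495, mn=1.163,
                  v1=0.4475, v2=0.3115, c12=0.2243, vn=0.4475)
exact = working_approx(3.0, 5, **normal_pop)

# Estimated parameters from k = 100 simulated samples: gives .6 * P3*(a).
rng = random.Random(1954)
samples = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(100)]
starred = working_approx(3.0, 5, **estimate_moments(samples))
print(round(exact, 4), round(starred, 4))
```

The first printed value should agree with the tabulated .6P3(3) = .0733. The second varies with the samples drawn, which is the sampling effect examined in Table IV.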

Some of the above considerations have been applied to samples of 5
from populations previously considered, namely the rectangular
distribution, the symmetrical triangular distribution, the beta
distribution with parameters 2, 2, and the normal distribution.

For the first of these we have f(x) = 1 for 0 ≤ x ≤ 1 and the integrals
defining P1(a), P2(a) are readily resolved. To find P3(a) we used low
moments given in [5].

For the triangular distribution f(x) = x for 0 ≤ x ≤ 1 and f(x) = 2 − x
for 1 ≤ x ≤ 2, the evaluation of the integrals tends to become tedious
and, accordingly, P2(a) was not computed. For the beta distribution
with parameters 2, 2, f(x) = 6x(1 − x) for 0 ≤ x ≤ 1, the evaluation of
P1(a), though laborious, was carried through for three
values of a.

For the standard normal distribution, P1(a) was evaluated by the
method of double quadrature, using Simpson's rule. Here P1(a) was
first expressed in the form


pia) =1-20f  f a++ Fa t e+ #0 
= Е(а + х2 + u)}8duda 


(29) 


and intervals of 0.2 were chosen for both u and x.
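
A double quadrature of this kind can be sketched as follows. The parametrisation is an assumption made for illustration and need not be the authors' choice of variables: integrating the joint density of x_1, x_2, x_5 over x_5 leaves 1 − P1(a) = 20 ∫∫ f(x) f(x + u) {F(a + x + ½u) − F(x + u)}³ du dx, with x = x_1 running over the whole line and u = x_2 − x_1 running from 0 to 2a. All function names are hypothetical.

```python
import math

def pdf(x):
    """Standard normal density f(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def cdf(x):
    """Standard normal distribution function F(x)."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def simpson(func, lo, hi, steps):
    """Composite Simpson's rule on [lo, hi]; steps must be even."""
    if steps % 2:
        raise ValueError("Simpson's rule needs an even number of steps")
    width = (hi - lo) / steps
    total = func(lo) + func(hi)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * func(lo + i * width)
    return total * width / 3.0

def tail_probability(a, h=0.2, x_limit=8.0):
    """P1(a) = Pr{x_5 - (x_1 + x_2)/2 > a} for samples of 5 from N(0, 1)."""
    u_steps = 2 * math.ceil(a / h)        # even count, interval width about h
    x_steps = 2 * math.ceil(x_limit / h)  # the range is truncated at +/- x_limit

    def inner(x):
        def integrand(u):
            # The bracket is non-negative on 0 <= u <= 2a and vanishes at u = 2a.
            bracket = cdf(a + x + 0.5 * u) - cdf(x + u)
            return pdf(x) * pdf(x + u) * bracket ** 3
        return simpson(integrand, 0.0, 2.0 * a, u_steps)

    return 1.0 - 20.0 * simpson(inner, -x_limit, x_limit, x_steps)

for a in (3.0, 3.2, 3.4, 4.0):
    print(a, round(tail_probability(a), 4))
```

The printed values are meant to be compared with the P1(a) column for the standard normal distribution in Table III. The truncation at ±8 is a convenience of this sketch.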

The results are collected in Table III. It will be observed that when
P1(a) lies between .01 and .05, RP3(a) affords an extremely close
approximation to P1(a) for the triangular, beta and normal distributions,
while for the rectangular distribution the approximation is excellent
when P1(a) is of the order of .05 and only slightly astray for smaller
values of P1(a).


TABLE III
TAIL PROBABILITIES FOR THE DISTRIBUTION OF x_5 − ½(x_1 + x_2)

Distribution                           a       P1(a)    P2(a)    P3(a)    RP3(a)

Rectangular                           .848     .0507    .0704    .0947    .0568
  f(x) = 1 for 0 ≤ x ≤ 1              .900     .0162    .0289    .0582    .0349
                                      .915     .0103    .0154    .0500    .0300

Symmetrical triangular               1.303     .0503             .0720    .0482
  f(x) = x for 0 ≤ x ≤ 1,            1.480     .0105             .0228    .0137
         2 − x for 1 ≤ x ≤ 2

Beta, parameters 2, 2                 .68      .0621             .0907    .0544
  f(x) = 6x(1 − x)                    .70      .0452             .0731    .0438
  for 0 ≤ x ≤ 1                       .75      .0184             .0404    .0242

Standard normal (*)                  3.0       .1004             .1221    .0733
  f(x) = (1/√(2π)) e^(−x²/2)         3.2       .0676             .0815    .0489
  for −∞ < x < ∞                     3.4       .0464             .0519    .0312
                                     4.0       .0094             .0102    .0061

* Question marks here have the same significance as in Table I.

The approximations P3(a) are based on known values of the
parameters in the distributions of the indicated order statistics. As
mentioned above, these parameters will, in practice, be estimated from the
data provided by groups of samples. To investigate the effect of this
procedure on the working approximation to P1(a), samples of 5 were
drawn from the beta and normal distributions, in manners previously
described, and the values of x_1, x_2, x_5 were listed. For the beta
distribution, three sets of 20 samples were drawn randomly from an original
lot of 100 randomly selected samples; for the normal distribution one
set of 20 samples and four independent sets of 100 samples were drawn.

The results of these drawings are collected in Table IV for the values
of k mentioned, together with the values of P1(a) and the working
approximation .6P3*(a) for selected values of a. For convenience in making
comparisons, the table includes the values of the parameters
themselves and of the approximation .6P3(a), corresponding to the value
∞ of k.


TABLE IV
RESULTS OF RANDOM DRAWINGS: CONTRAST x_5 − ½(x_1 + x_2)

Distribution         k      x̄_1      x̄_2      x̄_5     s_1²     s_2²     s_12     s_5²            .6P3*(a)

Beta                                                                                      a=.68    a=.70    a=.75
P1(.68) = .0621      20     .2812    .4157    .7060   .03180   .02402   .02203   .01105    .0274    .0219    .0120
P1(.70) = .0452      20     .2712    .4791    .7040   .01632   .01347   .00826   .01185    .0173    .0127    .0056
P1(.75) = .0184      20     .2464    .4305    .7847   .01572   .01933   .01309   .01303    .0494    .0305    .0213
                    100     .2417    .4097    .7434   .02077   .02367   .01524   .01422    .0455    .0360    .0201
                      ∞     .2393    .3786    .7607   .01565   .01760   .01040   .01565    .0544    .0438    .0242

Normal                                                                                    a=3      a=3.4    a=4
P1(3) = .1004        20    −1.049    −.391    1.280   .4395    .2260    .1365    .4229     .0658    .0253    .0041
P1(3.4) = .0464     100    −1.280    −.559    1.136   .4852    .3305    .2528    .4754     .0884    .0406    .0092
P1(4) = .0094       100    −1.234    −.499    1.243   .5314    .3394    .2428    .5578     .1042    .0519    .0138
                    100    −1.228    −.478    1.190   .4750    .3085    .23[illegible]  .4831     .0853    .0387    .0086
                    100    −1.228    −.483    1.135   .3951    .3246    .1843    .3903     .0651    .0254    .0042
                      ∞    −1.163    −.495    1.163   .4475    .3115    .2243    .4475     .0733    .0312    .0061


In the first set of 20 drawings from the beta distribution the error
in P3*(a) relative to P3(a) is, for each value of a, of the order of 50%;
for the second set of 20 it is of the order of 73%; for the third set it is
of the order of 10%; while the average of the three sets is of the order
of 44%. For the original set of 100 drawings the error is of the order of
17%.

For the normal samples, the set of 20 drawings leads to errors of 11%
when a = 3, 19% when a = 3.4 and 33% when a = 4. The means of the
errors for the four sets of 100 drawings are 23% when a = 3, 35% when
a = 3.4 and 62% when a = 4.

One set of 20 from the beta distribution yields low percentage errors
while the other two have high values. The errors exhibited by the single
set of 20 samples from the normal distribution appear to be
exceptionally low. The first three sets of 100 drawn each display higher errors,
the second being, perhaps, exceptionally high. The fourth set has errors
almost identical with those of the independent set of 20.

From these considerations it appears that about 100 samples of 5
would seem to be required in order that the approximation .6P3*(a)
should provide a reliable test criterion.

As in Section 6, the direct method of procedure would lead to errors
of 124% and 66%, corresponding to the 95 per cent confidence limits
for π when π = .05, k = 100, n = 5. Upon examination of Table IV we
find that for the beta distribution, with k = 100 and a = .7, the
percentage error in .6P3*(a) relative to P1(a) is 20% on the side of
underestimation. For the normal distribution, with k = 100 and a = 3.4, the errors
vary from 12% on the side of overestimation to 45% on the side of
underestimation.

We conclude that the use of the approximation [(n − 2)/n]P3*(a) is
preferable to the direct method in the circumstances described.


9. Concluding Remarks 


For any contrast it is possible to write down immediately the
expressions for P1(a) and P2(a), the latter being based on convenient
assumptions of independence between the groups of order statistics that
enter into the contrast. When, as in all practical situations, f(x) is
defined over a finite interval (α, β), the calculation of R is
straightforward.

Thus, considering the contrast ½(x_{n−1} + x_n) − ½(x_1 + x_2) and assuming
that the end pairs of order statistics are independent of each other in
forming the approximation P2(a), we find, for any distribution defined
over a finite interval,

        $R = \dfrac{(n-2)(n-3)}{n(n-1)},$

while, for the contrast x_n − ⅓(x_1 + x_2 + x_3) we find, under like conditions,

        $R = \dfrac{n-3}{n}.$

With the aid of such coefficients working approximations to the true
probability P1(a), involving the principle of normality, can be
developed generally by direct extensions of the methods described above.


10. Acknowledgments 


Our thanks are due to Professor J. W. Tukey, who drew our attention
to the original problem and who offered critical and constructive advice
during the progress of the study. We are indebted to Mr. H. J.
Godwin for some suggestions regarding the evaluation of P2(a) for
the range of the standard normal distribution, and to Professor E. R.
Love for a proof of the limit (n − 1)/n for R for the general chi-square
distribution when m increases indefinitely. For assistance with the vast
amount of computing that lies concealed behind the tabular entries
throughout the paper our special thanks are due to Miss Alison Doig.


REFERENCES 
[1] Fisher, R. A., and Yates, F., Statistical Tables etc., Edinburgh, Oliver and
    Boyd, 1942.
[2] Godwin, H. J., “Some Low Moments of Order Statistics,” Annals of
    Mathematical Statistics, 20 (1949), 279-85.
[3] Hald, A., Statistical Theory with Engineering Applications, New York, John
    Wiley and Sons, 1952, p. 320.
[4] Hartley, H. O., and Pearson, E. S., “Tables of the Probability Integral of
    the t-Distribution,” Biometrika, 37 (1950), 168.
[5] Hastings, C., Jr., Mosteller, F., Tukey, J. W., and Winsor, C. P., “Low
    Moments for Small Samples: A Comparative Study of Order Statistics,”
    Annals of Mathematical Statistics, 18 (1947), 413-26.
[6] Kendall, M. G., The Advanced Theory of Statistics, v. 1, London, Charles
    Griffin and Co., 1943, p. 221.
[7] Mahalanobis, P. C., ed., “Tables of Random Samples from a Normal
    Population,” Sankhya, 1 (1934), 289-328.
[8] Pearson, E. S., and Hartley, H. O., “Tables of the Probability Integral of the
    ‘Studentized’ Range,” Biometrika, 33 (1943), 89.
[9] andom Digits,” Journal of the American Statistical Association, 47 (1952),


OPTIMUM GROUPING IN ONE-CRITERION VARIANCE 
COMPONENTS ANALYSIS 


E. P. King*
National Bureau of Standards


Tests of significance associated with the single criterion analysis of
variance usually assume that a sample of n observations is drawn
from each of m normal populations with common variance σ². In the
“components of variance” model, the m population means are
themselves considered a sample of m observations on a superpopulation,
also normal, with variance θσ². The null hypothesis θ = 0 is tested
against the alternative θ > 0 by means of the F ratio. Detailed
descriptions of this model are given by Eisenhart ([3] and [4]); Ferris, Grubbs,
and Weaver [5]; and Crump [2].

When the number of populations is indefinite and the total number of
observations N = mn is limited, it is possible to determine which (m, n)
combination gives the most powerful F-test for a given N. This
problem was considered in [1], [4], and [5]. For all cases examined, each
(m, n) combination in turn was found to provide the most powerful test
for some interval of θ.

An extensive list of these optimum groupings would seem, however,
of limited practical interest; for the applied statistician is seldom in a
position to specify in advance the magnitude of θ which he desires to
detect. To remove the need of a priori information, this note proposes
a simple rule for selecting m and n which gives nearly maximum power
for all θ > 0.¹ The rule is to select m and n as nearly equal as possible.
The operating characteristics obtained from this selection are shown
in Figure 1 for m = n = 8, 10, 12, and 16 when the test is conducted at
the 5 per cent level of significance. The values used to plot the curves
were obtained for the most part by the well-known method outlined
in [5] using the percentiles of the F-distribution as given in [6]. Table
8.4 in [4] also served as a valuable check on the results. The dashed
curves enclose the operating characteristics for all the (m, n)
combinations that can be formed from the given amount of data. These curves
approximate the upper and lower envelopes for the family; no single
(m, n) choice will yield either curve as its operating characteristic.

Groupings in which m is much less than n are more powerful than
those for which m is approximately equal to n for very small values of
θ, but in this range no grouping gives a test of sufficient power for
practical use. Choosing m much greater than n gives a slightly more
powerful test for very large values of θ, but in this range the choice m = n gives
power very close to unity.


* Now with Eli Lilly and Company.

1 It should be emphasized that this paper deals wholly with significance tests rather than estimation.
If one is interested in estimating θ or θσ², the suggested procedure may not be at all optimum.


[Figure 1. Dashed curves: upper and lower envelopes.]

Fig. 1. Operating characteristics of the F-test for testing
θ = 0 against θ > 0 at the 5% level of significance.


REFERENCES

[1] Baines, A. H. J., “On economical design of statistical experiments,” (British)
    Ministry of Supply, Advisory Service on Statistical Methods and Quality
    Control, Technical Report, Series R, No. Q.C./R/16, July 15, 1944.
[2] Crump, S. Lee, “The present status of variance component analysis,”
    Biometrics (1951), 1-16.
[3] Eisenhart, Churchill, “The assumptions underlying the analysis of variance,”
    Biometrics, 3 (1947), 1-21.
[4] Eisenhart, Churchill, “Planning and interpreting experiments for comparing
    two standard deviations,” Chapter 8 in Statistical Research Group, Columbia
    University, Techniques of Statistical Analysis, edited by Churchill Eisenhart,
    Millard W. Hastay, and W. Allen Wallis, New York, McGraw-Hill Book
    Co., 1947, 207-314.
[5] Ferris, C. D., Grubbs, F. E., and Weaver, C. L., “Operating characteristics
    for common statistical tests of significance,” Annals of Mathematical
    Statistics, 17 (1946), 178-97.
[6] Hald, A., “Fractiles of the v²-Distribution,” Table VII in Statistical Tables
    and Formulas, New York, John Wiley and Sons, 1952, 47-59.


STATISTICAL ABSTRACTS 


All communications concerning this section should be addressed to the
Abstracts Editor, Professor George E. Nicholson, Jr., Chairman of the
Department of Statistics, University of North Carolina, Chapel Hill, North
Carolina.


Anderson, T. W., “On estimation of parameters in latent structure
analysis,” Psychometrika, 19 (1954), 1-10.

A new computational procedure for estimating the parameters in latent
structure analysis is developed. The procedure has the advantage of
avoiding the use of implicitly defined and unobservable quantities
as well as being relatively simple computationally. On the other hand
the proposed procedure has the disadvantage of using only part of the
available information and of using that part asymmetrically. A
numerical example is worked out in detail.

After reviewing the basic model of latent class structure, the author
develops the rationale for the estimation method suggested as the basis
for computation. The method deduces the proportion of the population
in each latent class, and the probabilities of positive responses for each
individual in the latent classes, from knowledge of probabilities of
positive responses for individuals in the population as a whole. There
are several possible choices of initial data for estimating the same
parameters; one may obtain differing estimates for the same parameters
depending upon the particular initial choice made. If the latent classes
are well defined, however, the range of the differences between
equivalent estimates will be relatively small. B. J. Winer, University of
North Carolina.


Bartlett, M. S., and Rajalakshman, D. V., “Goodness of fit tests for
simultaneous autoregressive series,” Journal of the Royal Statistical
Society, (B) 15 (1953), 107-24.

“Simplified method of derivation of Quenouille's (1947) goodness of fit
test for autoregressive series with discrete time, ... is extended to the
case of simultaneous autoregressive series .... The general solution is
applied to a particular first-order process in two variables.”
C. L. Edgett, Virginia Polytechnic Institute.


Beall, Geoffrey, and Rescia, Richard R., “A generalization of Neyman's
contagious distributions,” Biometrics, 9 (1953), 354-86.

A class of discrete distributions depending on three parameters, one of
which is called n, is constructed. It is shown that Neyman's contagious
distributions of types A, B, and C are members of this class for
n = 0, 1, 2 respectively. The calculation of the probabilities in these
distributions requires the use of recursive relations which are presented
here. The estimation of the parameter n is a problem which the authors
treat by fitting the frequency of cases with zero occurrences; the
remaining two parameters are estimated by the method of moments.
Several examples are discussed where it is shown that values of n other
than 0, 1 or 2 give better fit than is obtained with the distributions
previously employed. Lincoln Moses, Stanford University.


Bechhofer, Robert E., “A single-sample multiple decision procedure for
ranking means of normal populations with known variances,” Annals of
Mathematical Statistics, 25 (1954), 16-39.

The problem of classifying normal populations in two or more groups
with respect to the ranking of their true means has been considered.
The lower bound for the probability of correct grouping is obtained by
considering the least favourable configuration of the means consistent
with the given rankings. In the case of two groups, the least favourable
configuration of the means is μ[1] = ··· = μ[k−t] for the lower group and
μ[k−t+1] = ··· = μ[k] for the upper group, and the tables of probabilities
of correct grouping have been constructed for values of k and t and

        λ = (μ[k−t+1] − μ[k−t])/σ,

where σ is the common standard deviation of the populations. These
tables could conversely be used to determine the sample sizes when one
wants to ensure a certain lower bound of the probability of correct
grouping for λ > λ_0. The method has been generalized to two-way
classifications when the mean of the variable X_ij is given by
μ + α_i + β_j, and the experimenter wants to pick up the population with
highest α_i and highest β_j. Illustrations of the use of the tables have
been given. M. N. Ghosh, University of North Carolina.


Birnbaum, Allan, “Admissible tests for the mean of a rectangular
distribution,” Annals of Mathematical Statistics, 25 (1954), 157-61.

The problem of testing for the mean θ of a rectangular distribution has
been considered when the range is known and when the loss function is
simple, i.e. assumes the values zero and one for correct and incorrect
decisions respectively. Using the fact that the minimum observation u
and the maximum observation v are joint sufficient statistics, the Bayes
solutions are obtained in terms of functions of u and v. Using the
Neyman-Pearson Lemma, the most powerful one-sided and two-sided tests
are obtained from the class of Bayes solutions. M. N. Ghosh,
University of North Carolina.


Box, G. E. P., and Hay, W. A., “A statistical design for efficient removal
of trends occurring in a comparative experiment with an application in
biological assay,” Biometrics, 9 (1953), 304-19.

It is desired to compare the dose response curves of two biological
preparations. A trend in time is expected to exist, which would ordinarily
be confounded with the dosage response relations. An example involving
four dose levels, each repeated for both drugs, is given in full detail. The
dose levels to be given on the eight occasions (equally spaced in time)
are so chosen that linear and quadratic dose effects, linear, quadratic
and cubic time trend effects, and interactions between drugs and each
of the five named effects can all be estimated and all estimates be
mutually orthogonal. Since the dose response curves turn out as parallel
straight lines in the example, and the dose metameter is log dose,
relative potency is estimated and a confidence interval is given. The
last section of the paper deals with the method used to choose dose
levels yielding the desired properties of the design. Lincoln Moses,
Stanford University.


Chakravarti, N., and Bandyopadhyay, K. S., “A note on the consumption
of cereal per adult unit in Calcutta,” Sankhya, 13 (1953), 215-18.

Survey information from a family budget inquiry conducted during
1950-51 by the State Statistical Bureau, Government of West Bengal,
utilizing only the data for Calcutta, forms the basis of this note. The
method of least squares is used to estimate the consumption of cereal
in a given age or sex group of a random sample for the town of
Calcutta, India. T. S. Russell, Virginia Polytechnic Institute.


Chapman, Douglas G., “The estimation of biological populations,” Annals
of Mathematical Statistics, 25 (1954), 1-15.

This paper gives a systematic review of the various methods of sampling
used for estimation of wild population. The mathematical models and
assumptions are explicitly stated, which serves a very useful purpose of
forewarning the consumers of these statistical methods about the
pitfalls in this area. Most of these sampling methods depend on the
tag-recapture technique, which could possibly be useful for estimating
fish population. Other interesting methods depending on size of
successive samples developed by DeLury and others are also discussed.
M. N. Ghosh, University of North Carolina.


Dwyer, P. S., “Solution of the personnel classification problem with the
method of optimal regions,” Psychometrika, 19 (1954), 11-26.

The mathematical problem involved in personnel classification is shown
to be a special case of the general mathematical problem of linear
programming. After presenting numerical examples of variations in the
general problem that arise in the area of personnel classification, the
author points out that equivalent problems are encountered in the
Hitchcock transportation problem, problems that arise in biometric
classification, and problems encountered in a zero-sum two-person game.
Essentially the classification problem is that of finding coefficients that
maximize a linear form, subject to a set of linear constraints.

For most practical purposes special computational procedures rather
than the more general mathematical solution seem most feasible. The
conditions underlying the method of optimal regions as developed by
the author are generalizations of those given earlier by Brogden.
Computationally the method is an iterative one which basically moves
hyperplanes parallel to successive positions in such a way that the
optimal solutions, involving the number of points within the resulting
regions, eventually satisfy the desired quotas. In most practical
problems encountered so far, the method leads to a solution in relatively
few iterations. It is recommended for use for problems in which there
are preassigned quotas and only a small number of categories. In
application to a problem in which there were 1152 men and 7 job
categories, a solution was attained after eight iterations.

This article provides an excellent and readable summary of work that
has been done in this area. B. J. Winer, University of North Carolina.


Grubbs, Frank E., and Coon, Helen J., “On setting test limits relative to
specification limits,” Industrial Quality Control, 10 (1954), 15.

Specification limits for a product should not be used as limits for
determining the acceptability of the product if errors of measurement
will be made in the testing. The paper shows how to determine test
limits which satisfy several different criteria. If we let A be the chance
of and C_A the cost of accepting a nonconforming piece and B be the
chance and C_B the cost of rejecting a conforming piece, then the total
expected cost of making wrong decisions is C_A A + C_B B.

A general expression for the test limits which minimize this expected
cost is presented and tables of factors which determine the limits are
given for C_A = C_B, C_A = 2C_B, and for A = B.

A major conclusion of the paper is that in most cases the test limits
should be outside the specification limits unless C_A > 6C_B. In the
analysis it is assumed that the product quality and the measurement
errors are normally distributed with known variances. Albert H.
Bowker, Stanford University.


Gumbel, E. J., “The maxima of the mean largest value and of the range,”
Annals of Mathematical Statistics, 25 (1954), 76-84.

In this article the author generalizes some earlier results obtained by
both Plackett and Moriguti. He shows that the maximum of the mean
range calculated by Plackett holds for any continuous variate
possessing the first two moments.

The mean and the standard deviation of the largest value and the mean
range are given for a distribution where the mean largest value is a
maximum, and for another distribution where the mean range is a
maximum.

The asymptotic properties of the reduced values for the two
distributions are compared.

Graphs for probability and density functions which maximize the mean
largest value (n = 2, 3, 4, 5) and for the mean largest reduced values and
mean ranges as functions of n are drawn. A. E. Sarhan, University of
North Carolina.


Guttman, L., “Image theory for the structure of quantitative variates,”
Psychometrika, 18 (1953), 277-96.

A new structural model which purports to avoid problems of
indeterminacy inherent in the factor analysis model is presented. Within
a universe of variates, each variate is partitioned into a part that is
predictable by linear multiple regression from the other variates (a
common part) and a part that is not predictable (an alien part).
Sampling of variates within this universe defines partial images for each
variate (i.e., predicted values from n − 1 remaining variates); the
non-predictable parts are called partial anti-images. The author
contends that this partition is unique, whereas the factor analysis model
provides no unique definition for the partitioning of the variates. The
factor analysis model is said to be determinate only if it reduces to the
image analysis model as the size of the sampling from the universe
increases.

Under the partition made by image analysis, the correlation between
two variates can be expressed as the difference between the covariance
of the partial images and the covariance of the partial anti-images.
Using this identity, computational procedures for structural analysis of
interrelationships are developed. It is pointed out that in the special
case of determinate common factors, non-diagonal covariances of partial
anti-images must approach zero. B. J. Winer, University of North
Carolina.


Harman, H. H., “The square root method 
and multiple group methods of factor 
analysis,” Psychometrika, 19 (1954), 39-55. 

Multiple group methods of factoring cor- 
relation matrices have been used for some- 
what varying purposes by various authors. 
By some it is considered as a convenient 
method for extracting factors, by others 
it is considered as a means for locating 
reference vectors in accordance with a pre- 
determined hypothesis. The author at- 
tempts to integrate the various approaches 
t multiple group factoring methods and 

evelops a systematic notation for these 
methods, 

Phases of the computational procedures 
are shown to be simplified by application of 
the square root method for solving sets of 
simultaneous linear equations. Basically 
the square root m^thod is a computational 
technique for factoring a matrix into two 
triangular matrices. Detailed steps in the 
numerical application of the square root 
method are given. Its application to mul- 
tiple group factor analysis, computation of 
inverses, and regression analysis in general 
are clearly and compactly presented. For 
many purposes the square root algorithm is
shown to be superior to the Doolittle al-
gorithm. B. J. WINER, University of North
Carolina.
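
The abstract describes the square root method as a factoring of a matrix into two triangular matrices, used to solve simultaneous linear equations. A minimal sketch follows, assuming a symmetric positive definite coefficient matrix (for example a correlation matrix) and the factoring now usually called the Cholesky decomposition. The function names are illustrative and are not Harman's notation.

```python
import math

def square_root_factor(a):
    """Return lower-triangular L with L times its transpose equal to a.

    Assumes a is symmetric and positive definite.
    """
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)   # diagonal element
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]  # below-diagonal element
    return L

def solve(a, b):
    """Solve a x = b using the two triangular factors."""
    n = len(a)
    L = square_root_factor(a)
    # Forward substitution: L y = b
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    # Back substitution: (transpose of L) x = y
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x
```

Applying `solve` with each column of the identity matrix as `b` would give the inverse, which is one way to read the abstract's remark on computation of inverses.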


Hartley, H. O., and David, H. A., “Uni- 
versal Bounds for Mean Range and Ex- 
treme Observation,” Annals of Mathemati- 
cal Statistics, 25 (1954), 85-99. 


The range, mean range, and other recent 
techniques of short-cut analysis of variance 
are used in industrial quality control under 
the assumption that the basic distribution 
is normal. For example, an observed range 
may be converted to an unbiased estimate 
of the standard deviation by multiplication 
with a certain constant. The authors con- 
sider to what extent this estimate is biased 
when the basic distribution is not normal. 
The problem of establishing upper bounds 
for $E(\text{range})/\sigma$ has been considered by both
Plackett and Moriguti. They tabulated
upper bounds but little was done with the
lower bound. The authors show that
Moriguti's solution, which is confined to
finding an upper bound for symmetrical dis-
tributions, applies in general. Also, the
authors derive universal upper and lower
bounds for the ratio $E(\text{range})/\sigma$ for any
$f(x)$ in which $a \le x \le b$ (a and b are con-
stants). Universal upper bounds are given
for $E(X_n)/\sigma$ for the case where X is finite
and $X_n$ is the largest sample element. A. E.
Sarwan, University of North Carolina. 


Jowett, G. H., and Scott, J. F., “Simple 
graphical techniques for calculating serial 
and spatial correlations and mean semi- 
squared differences,” Journal of the Royal 
Statistical Society, Series B, 15 (1953), 81-


Two techniques for getting the approxi-
mate value of the serial correlation are
described. The value of the Mean Semi-
squared Difference,
MSSD $= \tfrac{1}{2}\,\mathrm{Average}\,(x_i - x_{i+1})^2$,
is determined by a “tracing”
method and a “transparent scale” method.
The approximate value of the serial correla-
tion is then
$[\mathrm{Average}\,(x_i-\bar{x})^2 - \mathrm{MSSD}] / \mathrm{Average}\,(x_i-\bar{x})^2$.
The graphical methods com-
pare favorably with direct use of a desk
calculator. While they are less accurate,
they are less tedious, can be carried out
with simple equipment, and the tracing
method is actually much quicker when
statistics are required for a large number of
logs. PAUL N. SOMERVILLE, Virginia Poly-
technic Institute.
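
The two quantities in the abstract can also be computed directly. The sketch below is an arithmetic illustration only, not the graphical procedure of the paper. The `lag` parameter and the choice of divisors (n for the squared deviations, n minus lag for the differences) are assumptions.

```python
def serial_correlation_via_mssd(x, lag=1):
    """Return (MSSD, approximate serial correlation) for the series x."""
    n = len(x)
    mean = sum(x) / n
    # Average squared deviation from the mean
    mean_sq_dev = sum((xi - mean) ** 2 for xi in x) / n
    # Mean semi-squared difference: half the average squared difference
    diffs = [(x[i] - x[i + lag]) ** 2 for i in range(n - lag)]
    mssd = 0.5 * sum(diffs) / len(diffs)
    return mssd, (mean_sq_dev - mssd) / mean_sq_dev
```

For the series 1, 2, 3, 4, 5 the average squared deviation is 2 and the MSSD is 0.5, so the approximate serial correlation is 0.75.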


Kamat, A. R., “Some properties of esti-
mates for the standard deviation based on
deviations from the mean and variate differ-
ences,” Journal of the Royal Statistical
Society, Series B, 15 (1953), 233-40.

Estimates of the relative variances of $D_{p0}$
and $D_{pr}$ are given, where $D_{p0}$ and $D_{pr}$ are
defined as the unbiased estimates of $\sigma^p$
based respectively on the pth power of the
deviation from the mean, and the pth power
of the absolute values of $\Delta^r x_i$, the rth order
variate differences. The relative variance
is defined as the square of the coefficient of
variation (Fisher). Putting $S_{pr}=(D_{pr})^{1/p}$, it
is shown that for large samples the relative
variance of $S_{2r}$ is smaller than that of $S_{pr}$ if
$p \neq 2$ and that the relative variance of $S_{20}$ is
less than that of $S_{2r}$. Some results are given
for small samples. If the values in the
sample have a slight linear trend in their
means, it is shown that $D_{11}$ and $D_{21}$ have a
much smaller bias than $D_{10}$ or $D_{20}$. Further,
the increase in their variance is much
smaller. If $w_{pr}=D_{pr}/s^p$ where $s^2=D_{20}$, then
it is proved that the kth moment about zero
of $w_{pr}$ is equal to the ratio of the kth mo-
ments about zero of the numerator and
denominator of $w_{pr}$. PAUL N. SOMERVILLE,
Virginia Polytechnic Institute.


Kempthorne, O., and Tischer, R. G., “An 
example of the use of fractional replica- 
tion,” Biometrics, 9 (1953), 295-303. 

It was desired to study the effect of 
various factors on the acceptability of de- 
hydrated corn. The relevant factors selected 
for study in an exploratory experiment
were: varieties (8), date of harvesting (4),
blanching condition (2), temperature of de-
hydration (2), temperature of storage (2),
length of storage (2). In addition it was
considered desirable to use 4 blocks rather
than a completely randomized design. A
complete experiment would thus require
$2048 = 2^{11}$ plots. Both blocks and varieties
could be represented as pseudo-factors in a
$2^k$ factorial design, so a fractional replicate was
chosen which resulted in no 2-factor inter-
actions being mutually confounded. In
addition the design had some features of a 
split-split-block design. 

The rationale for selection of the design 
is fully presented and the analysis and re-
sults are sketched. LINCOLN MOSES, Stan-
ford University.
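
The plot count in the abstract can be checked by multiplying the numbers of levels. The sketch below also counts the two-level pseudo-factors. Treating date of harvesting as two pseudo-factors, alongside the blocks and varieties named in the abstract, is an assumption made here so that every factor has two levels.

```python
# Levels per factor as listed in the abstract, with the 4 blocks included.
levels = {
    "varieties": 8,
    "date of harvesting": 4,
    "blanching condition": 2,
    "temperature of dehydration": 2,
    "temperature of storage": 2,
    "length of storage": 2,
    "blocks": 4,
}

plots = 1
for count in levels.values():
    plots *= count

# A factor with 2**j levels is represented by j two-level pseudo-factors.
two_level_factors = sum(count.bit_length() - 1 for count in levels.values())
```

The product is 2048 and the pseudo-factor count is 11, matching the complete experiment size stated above.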


Lancaster, H. O., “A reconciliation of χ²,
considered from metrical and enumerative 
aspects,” Sankhya, 13 (1953), 1-10. 

The relationship between the χ² of the
goodness of fit test and the deviations of the
moments from their expected values is con-
sidered by use of orthogonal transforma-
tions. These orthogonal transformations
yield alternative proofs of the distribution
of χ² used in the goodness of fit without re-
quiring the use of Stirling's approximation.
The method of this paper is a generalization
of the identification of the χ² used in ap-
proximating to the probability of obtaining
exactly m successes in n trials with con-
stant probability with the square of a
standardized normal deviate derived from
the consideration of n samples drawn from
a population where the variable can take
two values, 0 and 1. T. S. RUSSELL, Virginia
Polytechnic Institute. 


Lindley, D. V., "Statistical inference," 
Journal of the Royal Statistical Society, 
Series B, 15 (1953), 30-76.

The paper is concerned with analysis of 
experiments and the point of view taken is 
that the purpose of experimentation is to 
enable one to decide between certain 
courses of action. Kolmogorov’s axiomatic
theory of probability is used and Wald’s 
formulation of the statistical decision prob- 
lem is adopted, but not in full generality. 
Two simplifying assumptions are made, 
namely, 1) decisions on experimentation 
have been made and the only decisions re- 
maining to be made are terminal decisions; 
2) both the class of probability distributions 
and of decisions are finite. Under these 
simplifying assumptions it is established 
that there exists a class of decision func-
tions which is, in some sense, optimum. The
concept of minimum unlikelihood is in-
troduced and with it is constructed the
optimum class of decision functions. Conse-
quences of previous results are discussed as
well as what must be specified in order that 
meaning be given to the phrase “the best 
decision function.” Applications of the 
minimum unlikelihood method to some 
common statistical problems are given. 
Of particular interest is the result that $\bar{X}$ is
the minimum unlikelihood estimator of the
mean of a normal population for a wide
class of weight functions. A discussion of
the paper is included. F. S. McFeszrr,
Virginia Polytechnic Institute. 


Loevinger, J., Gleser, G. C., and DuBois,
P. H., “Maximizing the discriminating 
power of a multiple-score test,” Psycho- 
metrika, 18 (1953), 309-17. 
Starting with a heterogeneous pool of
items, one frequently encounters the prob-
lem of constructing subtests in a manner
that maximizes the correlation of items
within subtests and minimizes the correla-
tion between subtests. A method which
seeks to maximize the ratio of inter-item
covariance to total variance within each 
subtest is developed for this purpose. This 
ratio is defined to be the saturation of the 
subtest. 

The nucleus for subtest 1 is formed from
three items having high intercorrelations.
In Cycle 1 those items in the pool which, 
when added to the nucleus, lower the 
saturation of the subtest are eliminated 
from further consideration in Cycle 1. Of 
the items that do not lower the saturation, 
that one is selected that maximizes the 
saturation. This process is continued at each 
stage with the augmented nucleus until the 
pool of items is exhausted. The construc- 
tion of subtest 2 starts with a new nucleus 
of three items; Cycle 2 follows the same 
pattern as Cycle 1. Computationally several 
cycles can be carried out simultaneously. 

Maximizing the saturation of a subtest 
drawn from a finite pool of items will not 
necessarily result in a battery having maxi- 
mum discrimination power (minimum be- 
tween subtest correlation). The conditions 
under which maximum discrimination is 
not achieved by the method are given. 
These conditions are not in general over- 
restrictive for practical use of the method. 
B. J. Winer, University of North Carolina. 


Moran, P. A. P., "The random division of
an interval—Part III,” Journal of the 
Royal Statistical Society, Series B, 15 (1953), 
77-80. 

This paper is a sequel to two previous 
papers on the subject by the same author. 
The distribution of the sum of the squares 
of the intervals into which the line is 
divided is further considered. The lower 
5 per cent and 1 per cent points of the 
distribution can be found exactly up to 
n=8 and n=9 respectively (nine and ten 
intervals). Beyond about n=20 an ap- 
proximation based on the variance ratio 
distribution is probably adequate. For 
values of n between 9 and 20 no workable
formula has been reached. FRANKLIN S.
McFzzzrx, Virginia Polytechnic Institute.


Nath, Pran, “O. C. curve simplified,”
Sankhya, 13 (1953), 35-38. 

It was first pointed out by Barnard that 
most O. C. curves for fraction defectives
were sufficiently well represented by a
straight line, if drawn on logarithmic
probability paper using the logarithmic
scale for p. In this paper the author in-
dicates that in his experience Barnard's
method has not been satisfactory and he
presents a method using “harmonic prob-
ability paper.” Harmonic probability paper


given. The author asserts that more 
satisfactory results are obtained using his 
method but gives no mathematical ex-
planation. Leo LxxwcH, Virginia Poly-
technic Institute.


Ottestad, Per, “On the analysis of variance
of percentage fractions,” Skandinavisk 
Aktuarietidskrift, 35 (1952), 152-59. 
This paper discusses weighting methods 
for the analysis of variance of percentages
based on possibly unequal numbers of
trials. Random variation of the binomial
parameter and of the numbers of trials are
taken into account. After some general
comments on the meanings and limitations
of the process, the author develops a re-
gression method for obtaining the weights.
Particular emphasis is given to the case in
which number of trials is a random variable
independent of the binomial parameter but
dependent on classification. In this case he
suggests that the weights should be found
by estimating the (linear) regression of
class variance for percentage as a function
of the expected value of the reciprocal of
sample size; the weights are then taken as
the reciprocals of the appropriate values of
the estimated regression function. An ex-
ample taken from Cochran’s article on the
same topic in the Journal of the American
Statistical Association 38 (1943), pp. 287— 
301, is worked out in detail. Finally the 
method is extended to cover the case of 
multiple classifications. SEYMOUR SUDMAN, 
University of Chicago. 


Peto, S., “A dose response equation for the 
invasion of micro-organisms,” Biometrics, 


Where a dose can be represented as an 
integral number of units (say invading 
micro-organisms) and response is a yes-no
event such as death, a one parameter 
model can be offered—viz. probability of 
survival equals $e^{-np}$ where p is the unknown
parameter. This can be offered as an ap-
proximation to $(1-p)^n$ where p is small and
$(1-p)^n$ can be regarded as the probability
of surviving the (independent) onslaughts
of n organisms each having probability p of
killing the host. Estimation by maximum
likelihood is illustrated and tables to
facilitate the (iterative) solution are given.
The model is compared with the probit
model and it is concluded that it is a suit-
able alternative for many problems. Effi-
cient choice of dosage is shown to mean
concentrating the doses in the 10%-95%
survivors range. LINCOLN MOSES, Stanford
University.
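
The one-parameter model lends itself to a short numerical sketch. The code below compares the exponential form with $(1-p)^n$ and finds the maximum likelihood estimate of p by bisection on the score function. Bisection is a choice made here for illustration; it is not the tabular iterative scheme of the paper. The data layout (dose, survivors, deaths) is likewise an assumption.

```python
import math

def survival_exact(n, p):
    """Probability of surviving n independent organisms, each lethal with probability p."""
    return (1.0 - p) ** n

def survival_approx(n, p):
    """Exponential approximation exp(-n p), close to the exact form when p is small."""
    return math.exp(-n * p)

def score(p, data):
    """Derivative of the log-likelihood in p under survival = exp(-n p).

    data is a list of (dose n, number surviving, number dying) triples.
    """
    total = 0.0
    for n, survivors, deaths in data:
        x = n * p
        # 1 - exp(-x) is computed as -expm1(-x) for accuracy at small x
        total += -survivors * n + deaths * n * math.exp(-x) / (-math.expm1(-x))
    return total

def fit_p(data, lo=1e-12, hi=1.0, iterations=200):
    """Bisection for the root of the score, which decreases as p increases."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if score(mid, data) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With a single dose of 10 organisms and half of the hosts surviving, the estimate satisfies exp(-10 p) = 1/2, so p is log 2 divided by 10.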


Reid, A. T., “On stochastic processes in 
biology,” Biometrics, 9 (1953), 275-89. 
There are many diverse fields of enquiry 
in the domain of biology where processes or 
mechanisms can best be considered in 
probabilistic terms. Examples arise in
genetics, epidemiology, spread of rumors,
organization and function of the nervous 
system, migration of organisms, population 
growth, experimental carcinogenesis. This 
paper considers illustrative problems which 
have been dealt with as stochastic processes. 
Unsolved problems are pointed out. The 
bibliography covers a large amount of re-
cent work in this field. LINCOLN MOSES,
Stanford University.


Rosenbaum, S., "Tables for a Nonpara- 
metric Test of Location," Annals of Mathe- 
matical Statistics, 25 (1954), 146-50. 

To test whether two samples of n points 
and m points come from the same popula- 
tion, one counts the number of points, s, in 
one sample which lie outside an extreme 
value of the other sample. 

Tables are constructed to show the prob-
ability is less than 1% (or 5%) that s or
more points of a sample of size m (≤50) lie
outside an end point of a sample of size n
(≤50), provided the samples have been
drawn randomly from the same population
irrespective of its distribution. The author
used some formulas given earlier by S. S.
Wilks in a paper on tolerance limits. A. E.
Sarwan, University of North Carolina. 
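
The counting statistic can be sketched as follows for the upper extreme only. The null probability used here comes from an elementary argument that is not quoted from the paper: with a continuous distribution and no ties, at least s of the m points exceed the largest of the n points exactly when the s largest values of the pooled sample all belong to the sample of size m. Function names are illustrative.

```python
from math import comb

def count_beyond_max(sample_m, sample_n):
    """Number of points of sample_m lying above the largest point of sample_n."""
    top = max(sample_n)
    return sum(1 for value in sample_m if value > top)

def null_probability(s, m, n):
    """P(at least s of the m points exceed the maximum of the n points).

    Assumes both samples come from the same continuous population.
    """
    return comb(m, s) / comb(m + n, s)
```

For two samples of five points each, all five points of one sample exceed the maximum of the other with probability 1/252 under the null hypothesis.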


Roy, P. M., “A note on the unreduced 
balanced incomplete block designs,” Sank-
hya, 13 (1953), 11-16.

It is stated that an arrangement of vr
units of v varieties, r units of each variety,
in b blocks of size k (k<v) is known as a
“Balanced Incomplete Block Design" if (i) a
variety appears only once in a block and if
(ii) pairs of varieties each appear in λ
blocks. It is necessary that $bk = vr$, $\lambda(v-1)
= r(k-1)$, and $b \ge v$. A reduced form occurs
when b, r, λ have no common factor. The
object of the paper was to investigate (i)
what are the unreduced designs, (ii) their
connection with finite geometries, (iii) their
connection with theorems of the method of
differences, and (iv) whether they are
capable of presentation in resolvable and
affine forms suitable for the recovery of
inter-block information. The only possibly
resolvable forms are shown to be those de-
signs where $v = 2(t+1)$, $b = (2t+1)(t+1)$,
$r = 2t+1$, $k = 2$, $\lambda = 1$. R. A. BRADLEY, Vir-
ginia Polytechnic Institute.
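
The necessary conditions quoted in the abstract can be checked mechanically for the family with k = 2. In the sketch below the blocks are taken to be every unordered pair of the v varieties. That construction is an assumption about what the unreduced design with k = 2 looks like, not a statement from the paper.

```python
from itertools import combinations

def pair_design(t):
    """Parameters (v, b, r, k, lam) of the k = 2 family given in the abstract."""
    v = 2 * (t + 1)
    b = (2 * t + 1) * (t + 1)
    r = 2 * t + 1
    return v, b, r, 2, 1

def necessary_conditions_hold(v, b, r, k, lam):
    """Check bk = vr, lam (v - 1) = r (k - 1), and b >= v."""
    return b * k == v * r and lam * (v - 1) == r * (k - 1) and b >= v

def all_pairs_blocks(v):
    # Assumption: one block for each unordered pair of varieties.
    return list(combinations(range(v), 2))

def replication_counts(blocks, v):
    """Number of blocks in which each variety appears."""
    counts = [0] * v
    for block in blocks:
        for variety in block:
            counts[variety] += 1
    return counts
```

For t = 1 this gives v = 4 varieties in b = 6 blocks of two, each variety replicated r = 3 times and each pair occurring once.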


Sargan, J. D., “An approximate treatment 
of the properties of the correlogram and 
periodogram,” Journal of the Royal Statisti- 
cal Society, Series B, 15 (1953), 140-52. 

It is shown that the correlogram of an 
auto-regressive series can be regarded as 
though it were derived from an auto- 
regressive equation of double the order of 
the equation generating the original series. 
Some properties of the periodograms of 
time series generated by linear stochastic 
equations are investigated, and the results 
are applied to the study of Beveridge’s 
wheat price index. G. І. Eparrr, Virginia 
Polytechnic Institute. 


Sundrum, R. M., “The Power of Wilcoxon’s 
2-Sample Test,” Journal of the Royal Sta-
tistical Society, Series B, 15 (1953), 246-54.


Mann and Whitney's form of Wilcoxon's 
2-sample test is described, and the variance 
of the statistic U under the null hypothesis 
is given. The upper bound of the variance of 
U is derived, and a population is con- 
structed in which this upper bound is at- 
tained. Using the assumption that the 
statistic U is normally distributed under 
both the null and the one sided alternative, 
the power of the test is found for both 
normally distributed and uniformly dis- 
tributed variates. The power of the test 
under the normal case is compared with the 
power of Student's t, and the power under
the rectangular alternative is compared 
with the power of a test based on differ- 
ences of mid-ranges and a test based on 
differences of sample means. H. C. SWEENY,
Virginia Polytechnic Institute. 
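
The statistic U and its normal approximation under the null hypothesis can be sketched as follows. The counting convention (pairs in which a y falls below an x, ties counted as one half) is an assumption. The null mean mn/2 and variance mn(m+n+1)/12 are the standard results for samples without ties, not figures quoted from the abstract.

```python
import math

def mann_whitney_u(xs, ys):
    """Count the (x, y) pairs with y < x, scoring ties as one half."""
    u = 0.0
    for x in xs:
        for y in ys:
            if y < x:
                u += 1.0
            elif y == x:
                u += 0.5
    return u

def null_mean_and_variance(m, n):
    """Mean and variance of U when both samples come from one continuous population."""
    return m * n / 2.0, m * n * (m + n + 1) / 12.0

def normal_z(xs, ys):
    """Standardized U, for use with the normal approximation."""
    mean, variance = null_mean_and_variance(len(xs), len(ys))
    return (mann_whitney_u(xs, ys) - mean) / math.sqrt(variance)
```

A one-sided test would compare `normal_z` with a normal percentage point. The accuracy of that approximation in small samples is not examined here.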


BOOK REVIEWS 


Design for Decision. Irwin D. J. Bross. New York: The Macmillan Co., 1953. 
Pp. viii, 276. $4.25.


D. V. LINDLEY, University of Cambridge


“How do YOU make decisions? Are you unwittingly letting emotion con-
ceal essential facts? Are you sure of making the right interpretations
of facts and figures?” “How to make decisions that PAY. You'll learn new,
more effective techniques for reaching the best decisions on questions of all 
kinds in— Design for Decision.” “The methods pay. Today a large University 
is paying a Ph.D. on its research staff more than its football coach—because
of his decision-making skill.” “Be the man in demand—get a copy of... ." 
And so on, and so on. The quotations are taken from the book-jacket of, or
advertisements for, the book under review. It is a pity that in making a 
serious attempt to write popular science Bross should have been so ill-served 
by the blurb writers, and one can only query whether they weren’t paid more 
than Bross for producing this rubbish. 

In fact Bross has tried to give an account, as far as possible in non- 
mathematical language, of some of the ideas of modern statistical decision 
theory and statistical methods. Presumably the book is intended for the in- 
telligent layman who wishes to have some idea of what these much-maligned 
statisticians do, and if so it succeeds reasonably well in giving a general 
impression, though on points of detail it falls very wide of the mark. The 
style is lively and, as far as I can judge, pure American. The blurb is wrong
again in describing it as plain English. I found it most enjoyable. “[An] out- 
line of history, from Ooze to Oak Ridge,” “Statistical Inference. How to be 
a Great Detective in one easy lesson” are two delightful examples. Despite 
this style the author never makes the mistakes of the blurb writer. 

The book falls naturally into two parts. The first begins with a brief ac- 
count of the central role decision-making plays in man’s existence, discusses 
the nature of probability, and the need for a value system in making de- 
cisions, and concludes with suggested rules for making decisions and some 
examples of their use. The second part contains an account of some modern 
statistical ideas, including a detailed account of the basic ideas involved in 
testing hypotheses and a briefer mention of other statistical tools. A supple- 
ment gives suggestions for further reading. Probability to Bross is all embrac- 
ing; that is, we can speak of probabilities of hypotheses and use Bayes’s theo- 
rem. (Minimax, as understood by him, only includes maximization over a re- 
stricted class of a priori distributions, namely the reasonable ones.) Value 
systems are essential and pragmatism is the philosophy of life. When de-
cision rules are introduced they are not presented in the usual way with, on
the one hand, the “states of the world” and, on the other, the possible de-
cisions, in the way they usually occur in statistical problems, but with the
decisions contrasted with the possible outcomes. This would make the appli-
cation of the rules to statistical inference difficult if it were not for the fact
that Bross never seems to realize that statistical decision theory has anything
to do with, for example, testing the mean of a normal population when the
standard deviation is known! In fact the second half of the book is inde- 
pendent of the first half, apart from the discussion on probability. This is 
quite the most astonishing thing about the book. Whilst on p. 83 we are told 
that “the methods proposed by R. A. Fisher to replace Bayes rule also make 
assumptions about the [prior probabilities] . . . and the substitutes for
Bayes rule actually represent special cases of the rule,” we are later (Chap- 
ter 13) presented with a Fisherian argument for the test of normal mean 
based on significance levels without a mention of Bayes’s theorem! Similar 
remarks apply to the value system, except for some observations on a Simple
Value System (p. 101) which are not explored. It is amazing that he should 
not realize that a logical pursuit of the decision function method (either with 
or without prior probabilities) leads to a rejection of the significance level 
approach: in general if weights are assigned without consideration of the 
sample size, as they usually would be, then the significance level (i.e. the 
probability of error) will vary with this size. Consequently the second half of 
the book is not merely independent of, but is in contradiction with, the first 
part. There is even more to it than that, for some of the paradoxes of stand-
ard statistical methods are quoted without it being realized that they are
resolved by Wald’s ideas. Thus (p. 209) “One crusader against the myth of 
normality has a standard offer of $100.00 for any collection of data with over 
one thousand observations which will meet the standard statistical tests of 
normality. So far as I know he has had no takers.” Decision theory—balancing
one hypothesis against another—resolves this criticism of statistical tests. 
Nevertheless the second part of the book does contain a lively, reasonably 
accurate, and always interesting account of some statistical ideas. This is 
possible because, so far as one can see, practicing statisticians, that is people 
who handle data, not the mathematical manipulators of uninterpreted 
symbols, do not use decision theory ideas. Significance levels are still used 
despite Wald. To discuss why this is so would take us too far away from the
book under review. Any statistician wishing to recommend this book to his 
friends should therefore warn them to take a good pinch of salt with the first 
part. Even the second half is not free from blemish for there is one grand 
howler. The test chosen by Bross as an example is that for the mean of a nor-
mal population, standard deviation known. The description is lucid and it is
a great pity that it is spoiled by the use of the sum of squares as the test cri-
terion and reference to the χ²-table for its significance! (p. 229). This appears
to have arisen through a misunderstanding of the term “sufficient statistic.”
There are many other things one might comment on, for this is a stimu- 
lating book. Several remarks are sheer nonsense: “Our standard for bias is 
based on the rule that whenever an overwhelming majority of observers are
in agreement they are, ipso facto, right" (p. 152). Others are very sensible— 
those on the importance of a value system, for example. Other remarks are 
contradictory: “if this outcome or result of the decision process is agreeable 
to me, then the decision may be adjudged satisfactory” (p. 20), but “...a 
definite and objective way to progress from data to inference . . . all arrive at 
the same conclusion provided they start from the same data” (p. 221). (In 
both cases my italics.) What does, however, annoy me is Bross’s assurance.
He states everything as though there were no doubt at all. Pragmatism is
right. Probabilities of hypotheses are all right. He never gives a hint that 
other people might hold different views and yet be sound. The world is es- 
sentially simple to him: his blacks are jet, his whites are bright and there are 
no greys. I envy him. But “I beseech you, in the bowels of Christ, think it 
possible, you may be mistaken.” 

Here then is a book which gives overall a good account of statistical ideas 
for the layman, a less satisfactory account of decision theory, but which is 
always entertaining. It contains a magnificent misprint. In connection with 
the Renaissance, the renewed interest in Greek ideas, and the dawn of the
experimental method, we read: “The same era that witnessed the rediscovery 
of Reason also saw the birth of the successor to Reason—Silence.” 


The Design and Analysis of Experiment. M. H. Quenouille. New York: Hafner
Publishing Co., 1953. Pp. xiii, 356. $7.50. 


BERNARD OSTLE, Montana State College


The appearance of four books on experimental design in the last few years
is evidence that authors have become aware of a gap which has existed in
the literature for too long. The four books are: Experimental Designs by
Cochran and Cox, Design and Analysis of Experiments by Kempthorne,
Analysis and Design of Experiments by Mann, and the volume reviewed here.
The four authors had different goals. Mann’s text is a purely mathematical
approach to general linear hypothesis theory. Cochran and Cox give a catalog
of useful designs, accompanied by considerable explanatory material which
makes their book most helpful to research workers. Kempthorne’s work is
conceived on a larger scale: he successfully attempts a logical development
of designs (mainly factorials) and design principles, and thus has produced
a book of an advanced nature more suitable for graduate study than for re-
search use. Quenouille’s book is most nearly akin to that by Cochran and
Cox, but gives greater emphasis to groups of experiments and long-term
policy.

The Design and Analysis of Experiment has been divided into four sections 
which, with the topics included in each, are: (A) Elementary Principles and 
Designs: (1) The design and analysis of experiments, (2) Randomised blocks
and Latin squares, (3) Factorial and split-plot designs; (B) Incomplete
Block Designs: (1) Factorial designs involving factors at two or three levels,
(2) Complex factorial designs, (3) Incomplete block designs for a single set
of treatments; (C) Long-Term Policy: (1) Long-term experiments, (2)
Planning of groups of experiments, (3) Combinations of experimental re-
sults; (D) Experimental Complications: (1) Special designs and analyses,
(2) Missing observations, (3) Scaling of observations.

In the preface, the author states: “This book is aimed at those wishing to 
acquire a working knowledge of experimental design and an understanding 
of the principles governing it.” Unfortunately, the explanations are often too 
brief and too technical to be of great value to research workers not well- 
versed in statistical theory and principles. In particular, this is not a book for 
persons lacking a solid background in analysis of variance. The constant 
stress on the assumptions necessary for a valid use of the various designs is, 
however, an excellent feature. Too often this important part of experimental 
planning is omitted from classroom lectures and text book chapters. 

In section B, Quenouille describes the main types of confounding. The use
of partial confounding is discouraged because of complexities of computation. 
While it is true that partial confounding does add to the time involved in 
calculation, this extra effort is frequently worthwhile. This apparent con- 
demnation of partial confounding is characteristic of Quenouille’s tendency 
to use sweeping statements which may lead the non-critical reader to
lose the benefits of certain advanced techniques.

The most interesting material in this book is contained in Sections C and 
D, especially under the heading Long-Term Policy. The discussion on plan- 
ning of groups of experiments and combining experimental results is excellent. 

On the other hand, the reviewer wonders why “missing observations” were
singled out for special attention. This subject could easily have been handled 
as a necessary consideration when discussing each particular design. Also, 
rotation experiments surely deserve more attention than the four pages de- 
voted to them. 

In summary, much useful information is contained in the book. However, 
the tendency to be brief and technical, accompanied by an occasional careless- 
ness in writing (due, no doubt, to attempting to complete the work within 
too short a period of time), has detracted from the value of the book. It is 
not to be read quickly. Nevertheless, experienced research workers in the
agricultural and biological sciences will find it a good reference text if read
critically.


Sample Survey Methods and Theory. Vol. I. Methods and Applications;
Vol. II. Theory. Morris H. Hansen, William N. Hurwitz and William G. Madow.
New York: John Wiley & Sons, Inc., 1953. Pp. xxii, 638; xiii, 322. $8.00; $7.00.


TORE DALENIUS, Stockholm


The campaigns conducted during the first quarter of this
century for an increased use of “the representative method,” supported
by the International Statistical Institute among others, showed special suc-
cess in the latter part of the 1930’s. The lead in the new era of sampling was
taken by India and the United Kingdom, where the development naturally
was knitted to improvements in the field of agricultural statistics, and by the
United States, where a great portion of the development was knitted to im- 
provements of methods for measuring socio-economic phenomena. 

The developments of 1939 accelerated this trend. In the U. S., the Bureau 
of the Census took the lead. The Census Bureau introduced sampling meth-
ods into the 1940 census to an extent not previously seen, and transferred
many sample surveys carried out by means of non-probability methods to a 
probability basis. This development was responsible for the creation of a
large and competent “sampling staff” within the Census Bureau. Among the
members of this staff were Morris H. Hansen, William N. Hurwitz and Wil- 
liam G. Madow, authors of the two volume work Sample Survey Methods and 
Theory, published as one of the Wiley Publications in Statistics. 

A large amount of the work carried out by Hansen-Hurwitz-Madow and 
their colleagues necessarily meant application of already available theory to 
actual survey operations. But a considerable portion of the activity was 
devoted to the development of new theory and new methods. 

Portions of the results thus achieved have been presented earlier; examples 
are the book A Chapter in Population Sampling, the almost classical 1943 
paper entitled “On the Theory of Sampling from Finite Populations” and 
the 1949 paper “On the Determination of Optimum Probabilities in Sam-
pling.”

Objects of the book. Sample Survey Methods and Theory represents “an at-
tempt to give a comprehensive presentation of both sampling theory and
practice.” The book as a whole, as well as each one of the two separate vol- 
umes, is designed as a textbook; it should, moreover, serve as a manual for 
the investigator engaged in the design of sample surveys. Finally, parts of 
the book are intended for “the user of the results of surveys who wishes to 
know the circumstances under which he may place confidence in information
based on samples.” 

Broad summary of content. As indicated by the subtitles of the two volumes,
Volume I is devoted to applications (it is labelled, by Wiley, “applied sta- 
tistics”); Volume II is devoted to theory (labelled, by Wiley, “mathematical 
statistics”). 

Volume I may be looked upon as made up of three parts; the introduction
and chapters 1-3 make up the first part, chapters 4-11 the second part, and 
chapter 12 the third part. 

The introduction and chapters 1-3 address themselves to the consumer of 
Survey results rather than to the producer. The language is “nonmathema- 
tical”; this does not mean a complete lack of formulas and symbols but a 
frequent use of easily-grasped illustrations. In addition to presenting the 
usual sampling principles, the first part presents the philosophy of the use of
measurable sample survey designs.


652 AMERICAN STATISTICAL ASSOCIATION JOURNAL, SEPTEMBER 1954 


Chapters 4-11 present a detailed account of methods for sampling from 
finite populations. Most results are given with references to proofs in Volume 
II. The following list of chapter headings gives an idea of the contents: 4, 
Simple random sampling; 5, Stratified random sampling; 6, Simple one- or 
two-stage cluster sampling; 7, Stratified single- or multi-stage cluster sam- 
pling; 8, Control of variation in size of cluster in estimating totals, averages, 
or ratios; 9, Multi-stage sampling with large primary sampling units; 10, 
Estimating variances; 11, Regression estimates, double sampling, sampling 
for time series, systematic sampling, and other sampling methods. 

Chapter 12, “Case Studies,” constitutes the third part. In the first half of 
this 110 page chapter, three sample surveys from the practice of the Census 
Bureau are presented in detail in a way which demonstrates how the many 
methods presented in chapters 4-11 are integrated into a survey design. The 
rest of chapter 12 deals with two studies of variances and co-variances and 
the use of quality control methods in the office processing of the 1950 
Censuses of Population, Housing, and Agriculture. 

Likewise, Volume II may be looked upon as made up of three parts; 
chapters 1-3 make up the first part, chapters 4-11 the second part, and chapter
12 the third part. In chapter 1, the fundamental definitions used in Volume I
are summarized; by this device, used throughout Volume II, this
volume is made self-contained; it is possible to read Volume II before Volume
I, or even to read only Volume II. Chapters 2-3 present the fundamental
theorems on probability, expected values and variances necessary for master- 
ing the proofs given in the following chapters. 

Chapters 4-11 are mainly devoted to proofs of the results quoted in Vol- 
ume I. The presentation runs, chapter by chapter, parallel in the two vol- 
umes. However, in addition to the empirical results presented in Volume I, 
there are new ones scattered in these chapters. 

Chapter 12, finally, presents a theory of response errors; the chapter is a
revision of a paper published in this Journal in 1951. 

An effort to evaluate the book. I am of the opinion that a review, in order to 
be comprehensive, should end up with an effort to evaluate the book against
the background of the objectives set forth by the authors.


The book is a “comprehensive presentation of both sampling theory and
practice.” As to theory (for sampling from finite populations), I am at a loss
to find anything of importance that has been left out; as to practice, it is 
true that most applications are selected from the work within the Census
Bureau but there seems to be no important type of finite population entirely
left out.

I have not had experience with this book as a textbook; I have had, however,
the opportunity of attending a course, in Washington in 1951, which
was based on notes having a history in common with this book. From this
experience, I conclude that the book will be found to be an excellent textbook.

There is a specific feature of the book which deserves to be mentioned. By
splitting the book into two volumes, one “applied” and one “theoretical,” 




BOOK REVIEWS 653 


the authors have been able to present the solution of a design problem in 
different ways in the two volumes. In Volume I, the authors stress just as 
much the region (interval or whatever it may be) within which the exact 
mathematical solution is to be found, as the exact solution itself; thus, the 
authors stress that “the optimum is broad” when discussing, for example, 
optimum allocation in stratified sampling and optimum size of cluster. 
Solutions of this kind are sorely needed in actual work where one almost 
always has to design without exact information as to the size of important 
“design parameters” (such as variances). In volume II, on the other hand, 
emphasis is on the exact solutions. Volume II thus teaches the technique to 
use on one’s own problems.
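
The remark that “the optimum is broad” can be illustrated numerically. The following is only a sketch and is not taken from the book: it assumes two hypothetical strata with invented sizes and standard deviations, ignores the finite population correction, and compares the variance of the stratified mean under the optimum (Neyman) allocation with the variance under other allocations of the same total sample.

```python
# Hypothetical two-stratum population (all numbers are illustrative assumptions).
N = [6000, 4000]      # stratum sizes
S = [10.0, 30.0]      # stratum standard deviations
n_total = 400         # total sample size


def variance_of_mean(n1):
    """Variance of the stratified mean for sample sizes (n1, n_total - n1),
    ignoring the finite population correction."""
    total = sum(N)
    n = [n1, n_total - n1]
    return sum((N[h] / total) ** 2 * S[h] ** 2 / n[h] for h in range(2))


# Optimum (Neyman) allocation: n_h proportional to N_h * S_h.
weights = [N[h] * S[h] for h in range(2)]
n1_opt = n_total * weights[0] / sum(weights)
v_opt = variance_of_mean(n1_opt)

# 240 is the proportional allocation for stratum 1; 100 and 170 lie near the optimum.
for n1 in (100, n1_opt, 170, 240):
    ratio = variance_of_mean(n1) / v_opt
    print(f"n1 = {n1:6.1f}  variance relative to optimum = {ratio:.3f}")
```

Under these assumed figures, moving the first stratum's sample a fair distance from its optimum raises the variance by only a few per cent, whereas proportional allocation loses considerably more. This is the sense in which a designer without exact values of the design parameters can still come close to the optimum.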

Space does not permit a detailed discussion of other valuable aspects of
this book. I only want to mention that I welcome especially the thorough 
analysis of survey costs and the construction of cost functions. There are, 
in official survey reports from all over the world, many examples of cost 
functions; but these examples are often difficult to interpret and use as long 
as there are only, at the very best, indications of what kind of costs are taken 
care of by the different components in the cost functions. 

In summary, this is a great book, which will be indispensable to every per- 
son, statistician or not, who comes close to sample surveys. Of course, in a 
book of nearly 1,000 pages, one can find opportunities to criticize some 
points. Most of them are too minor to be dealt with at length (e.g., the def- 
inition of simple random sampling, chapter 4, does not seem to fit the one 
in chapter 5; I would prefer to see a distinction between a “ratio estimate” 
and an “estimate of a ratio,” and so on). But it is perhaps justifiable to say
that chapter 11 is somewhat displaced and too “mixed.” Personally, I would
rather see one separate chapter devoted to systematic sampling, possibly
placed before the present chapter 6. The difference and regression estimates
could be discussed in exactly the same way as are the ratio estimates (i.e.,
in conjunction with the specific sampling systems, as means of improving
precision over and above that of simple estimates such as the sample mean).
[Note: Since chapter numbers and content are parallel in the two volumes,
the foregoing references to chapters are for both volumes.]

The language in the book is, even for a foreigner, easy; one soon gets used 
to words like “rel-variance,” “epsem,” etc. But the symbols are part of the
language. The lack of a standard in this area is in itself regrettable and should,
I think, be taken care of. As a result of this lack, Wiley has published in
1953 two excellent books in sampling, this and Cochran’s, which use rather
different symbols. 

I hate to finish my review of this excellent book by being critical, so I 
repeat: Sample Survey Methods and Theory is a great book, which will be
indispensable to every person, statistician or not, who comes close to sample
surveys.




Elementary Statistical Analysis. Harry P. Hartkemeier. Dubuque, Iowa: Wm. 
C. Brown Co., 1952. Pp. xxii, 484. $6.00. Paper. 


Acheson J. Duncan, The Johns Hopkins University


This book by Professor Hartkemeier is the text that is used in the freshman
course in elementary statistics at the University of Missouri at which
he is Director of the Statistics Laboratory. It assumes no prerequisite of
college mathematics and is slanted toward the student of business. In the 
author's own words, it “has been written for the person who likes to have 
directions for the immediate practical application of the elementary statis- 
tical techniques to sample data without waiting until all statistical techniques 
and the mathematical theory underlying them have been explained in de- 
tail.” 

In his twenty odd years of teaching statistics Professor Hartkemeier has 
developed many novel ideas as to how a beginning course in the subject 
should be taught and has as a consequence written a book that is strikingly 
different from other statistics texts. This is true with respect to both format 
and contents. The pages are full size reproductions of typewritten copy, the
cover is heavy paper and the whole is fastened together with plastic screws 
that allow withdrawal of pages at will. This loose leaf character of the text 
permits the author to include special problem forms and work sheets that 
upon completion may be extracted by the student for submission to the 
instructor. It also permits extraction of tables at time of examination without 
running the danger of recourse to forbidden sections of the text. Illustrations, 
tables, and problem forms are placed at the ends of the chapters thereby 
permitting ready reference.

The book is concerned primarily with tests of significance for small samples.
After an “introductory” chapter in which the reader is given the elements
of frequency distributions, time series analysis, index numbers and
correlation (all in 35 pages), the author settles down to a detailed discussion 
of the computation of square roots and use of mathematical tables, the computation
and use of the arithmetic mean and standard deviation and other
types of averages and representative values, tests of significance for means
and standard deviations and comparisons of two means and standard deviations,
contingency tables and chi-square tests, and a discussion of analysis of
variance that includes problems related to both single and two way classifications,
unequal numbers in different cells, Latin squares, and data with
several sources of random error. There is also a chapter on statistical quality
control in manufacturing operations, and a chapter on computing procedures
and machines, drawn heavily from the author’s book Punch-Card Methods 
and illustrated copiously with pictures (supplemented with directions for
use) of many of the modern computing machines, including even the Monroe
[...]. All this material is presented in a very readable
style that the reviewer believes beginning students will like. At the end of
each chapter there are numerous problems and considerable problem data.




In writing a beginning book that attempts to explain the principles of 
statistical theory without the use of mathematics, the greatest difficulty is 
to prevent the argument from becoming inexact and perhaps incorrect. 
Although Professor Hartkemeier does a splendid job on the whole of explaining
difficult material, he has not succeeded entirely in avoiding misleading
statements and errors. The principal instances of this kind noted by the 
reviewer are as follows: 

(1) Charts of frequency curves in the book have a vertical scale marked 
“number of” or “relative number of” cases, whereas in fact it is the area 
under the curve that is a measure of the “relative number of” cases and not 
the ordinate of the curve. 

(2) The text suggests that the mean has little use in a highly skewed dis- 
tribution. There are cases, however, in which it may be the “best” average 
for certain purposes. If we know the mean family income, for example, and 
the number of families in a given community, we can compute the com- 
munity income. We cannot do this with the mode or the median. 
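
The arithmetic behind this point can be shown in a few lines. The incomes below are invented for illustration and form a deliberately skewed set; the sketch checks only that the mean, multiplied by the number of families, recovers the community total, while the median does not.

```python
from statistics import mean, median

# Hypothetical family incomes in a small community (illustrative only).
incomes = [2000, 2500, 3000, 3500, 4000, 5000, 30000]

n_families = len(incomes)
total_income = sum(incomes)

# The mean times the number of families recovers the community total.
from_mean = mean(incomes) * n_families
# The median times the number of families does not.
from_median = median(incomes) * n_families

print(total_income, from_mean, from_median)
```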

(3) The text tends to give a false conception regarding the character and 
use of the t-distribution. The impression is given that the ratio of a variable 
to its standard error follows the t-distribution if the sample from which the 
variable is calculated is small. Thus, on p. 372, the ratio of a mean of a 
sample of 25 to a known standard error is treated as a t variable. Actually it
is a normal variable. Again, on p. 373, the ratio ($\sigma$ sample $-$ $\sigma$ universe)/known
standard error of $\sigma$ is treated as if it had the t-distribution (apparently because
the sample size is 25) whereas it actually is distributed as $\chi$, or more
precisely as $\sqrt{2}\,(\chi - \sqrt{N})$.

The facts about the t-distribution are as follows: If z is a normally distrib- 
uted variable with zero mean and unit standard deviation, if u is a variable 
that follows the $\chi^2$ distribution with n degrees of freedom, and if z and u
are independently distributed, then the ratio $z/\sqrt{u/n}$ has the t distribution
with n degrees of freedom. The t distribution approaches the normal distribution
as the degrees of freedom become infinitely large.
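
The distinction drawn in point (3) can be checked by simulation. The sketch below is not from the text under review: it assumes a normal universe with an arbitrary mean and standard deviation, draws repeated samples of 25, and counts how often the ratio of the sample mean's deviation to its standard error exceeds the two-sided 5 per cent point of t with 24 degrees of freedom, once with the known standard error and once with the standard error estimated from the sample.

```python
import math
import random

random.seed(0)            # fixed seed so the sketch is reproducible
n = 25                    # sample size used in the criticized examples
mu, sigma = 0.0, 1.0      # hypothetical universe mean and standard deviation
trials = 20000
t_crit = 2.064            # two-sided 5% point of t with 24 degrees of freedom

exceed_known = 0          # |ratio| > t_crit using the known standard error
exceed_estimated = 0      # |ratio| > t_crit using the estimated standard error
for _ in range(trials):
    x = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(x) / n
    s = math.sqrt(sum((v - xbar) ** 2 for v in x) / (n - 1))
    if abs((xbar - mu) / (sigma / math.sqrt(n))) > t_crit:
        exceed_known += 1
    if abs((xbar - mu) / (s / math.sqrt(n))) > t_crit:
        exceed_estimated += 1

frac_known = exceed_known / trials
frac_estimated = exceed_estimated / trials
normal_tail = math.erfc(t_crit / math.sqrt(2))   # P(|Z| > t_crit) for normal Z

print(frac_known, frac_estimated, normal_tail)
```

With the known standard error the exceedance rate should settle near the normal tail area (a little under 4 per cent), not 5 per cent; only the ratio using the estimated standard error should come out near 5 per cent. Small sample size alone does not produce a t variable.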

(4) The text fails to point out (p. 334) that the use of the ordinary $\chi^2$
table of probabilities in $\chi^2$-tests of frequencies involves an approximation.
This is essentially the same in character as the approximation of a binomial
probability by an area under a normal curve. It is the reason for the $\chi^2$
correction for continuity in a 2×2 contingency table when samples are not
large.
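
To show what the correction for continuity does in a small 2×2 table, here is a short sketch using the usual shortcut formula for the statistic and Yates' form of the correction. The cell counts are invented for illustration and do not come from the text under review.

```python
def chi_square_2x2(a, b, c, d, continuity=False):
    """Chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if continuity:
        # Yates' correction: shrink |ad - bc| by n/2, but not below zero.
        diff = max(diff - n / 2, 0.0)
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * diff ** 2 / denom


# Hypothetical small-sample table (illustrative counts only).
plain = chi_square_2x2(8, 2, 3, 7)
corrected = chi_square_2x2(8, 2, 3, 7, continuity=True)

print(round(plain, 3), round(corrected, 3))
```

For these assumed counts the uncorrected statistic lies above 3.841, the 5 per cent point of $\chi^2$ with one degree of freedom, while the corrected statistic lies below it, so the approximation matters to the verdict when samples are small.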

(5) The title of Chapter 9, “Statistical Methods Necessary for Quality 
Control in Manufacturing Operations,” is misleading. A better one would 
have been “The Author’s Ideas on Some Quality Control Procedures.” 
Although Professor Hartkemeier refers to several books on quality control, 
including one by the reviewer, he draws little from them. Instead he proceeds
to describe methods that differ widely from the standard procedures.
Thus, his X-chart uses samples of 25, not the customary samples of 4 or 5.



The central line on the chart is apparently based on specifications, in the 
manner of a modified X-chart. The lower limits on the chart are 2.5% and
10.5% probability limits incorrectly based on the t-distribution, not the ordinary
$3\sigma$ limit. The upper limits are statistical limits but are confusedly interpreted
as being related to some specification based on the desire not to produce
too good a product. The discussion emphasizes the use of the X-chart
to maintain a constant allowable fraction defective, but says little about the 
use of the chart in discovering assignable causes (the principal use of the 
control chart). 

The chart that is suggested for controlling variability is the little-used 
standard deviation chart. Again it is related to the attempt to maintain a 
constant fraction defective rather than to the attempt to detect assignable 
causes. Strangely (and incorrectly), limits are again set by a t factor. No 
mention is made of the range chart, although this is used in industry many
more times than the standard deviation chart. The statement made earlier
in the text (pp. 117, 212) that the range is not used much by statisticians is 
very much in error in the industrial field. 
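
For contrast with the procedures criticized above, the customary calculation of limits for a chart of sample means and a range chart can be sketched as follows. The measurements are invented, and the factors A2, D3, and D4 are the standard tabulated control-chart factors for subgroups of five; nothing here reproduces the method of the text under review.

```python
# Hypothetical measurements: 6 subgroups of 5 items each (illustrative only).
subgroups = [
    [10.2, 9.9, 10.1, 10.0, 9.8],
    [10.0, 10.3, 9.7, 10.1, 9.9],
    [9.9, 10.0, 10.2, 9.8, 10.1],
    [10.1, 10.0, 9.9, 10.2, 10.0],
    [9.8, 10.1, 10.0, 9.9, 10.2],
    [10.0, 9.9, 10.1, 10.3, 9.7],
]

# Standard control-chart factors for subgroups of size 5.
A2, D3, D4 = 0.577, 0.0, 2.114

means = [sum(g) / len(g) for g in subgroups]
ranges = [max(g) - min(g) for g in subgroups]
grand_mean = sum(means) / len(means)
mean_range = sum(ranges) / len(ranges)

# Three-sigma limits for the chart of means, estimated from the average range.
xbar_limits = (grand_mean - A2 * mean_range, grand_mean + A2 * mean_range)
# Limits for the range chart.
range_limits = (D3 * mean_range, D4 * mean_range)

# Subgroups whose mean falls outside the limits would prompt a search
# for an assignable cause.
out_of_control = [i for i, m in enumerate(means)
                  if not xbar_limits[0] <= m <= xbar_limits[1]]

print(xbar_limits, range_limits, out_of_control)
```

The limits here come from the data themselves, through the average range, and a point outside them is read as a signal to look for an assignable cause, which is the principal use of the chart noted above.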

In any course in which this book is used, the reviewer would strongly urge 
the omission of this chapter on statistical quality control. Fortunately, such 
an omission can be made without difficulty. 

(6) The text does not state the basic assumptions of analysis of variance 
(additivity of main effects and normality, independence, and uniform vari- 
ability of the random variable). This omission is somewhat unfortunate in 
that the text is free and easy in its illustrations of the uses of analysis of variance.
It is applied to percentage data (pp. 441, 451) where the assumption of
uniform variability is probably violated; and it is applied to sales data (p.
443) where the assumption of additivity of main effects is questionable. It 
may be admitted that the F-test is “robust” and may not be seriously affected
by such deviations from strictly valid procedures. The use of such illustrations,
however, coupled with failure to state the assumptions underlying
strictly valid procedures may lead the reader to believe that there are no
limitations to the application of analysis of variance techniques. 

More serious than these deviations from the basic assumptions of analysis 
of variance is the tendency to play around with the data until some signifi- 
cant conclusion is reached with little regard to the effect of this procedure 
upon the final level of significance. Thus, t-tests are run after an F-test shows
non-significance (p. 398), and data are reclassified (p. 399) and tested again
after a first F-test shows non-significance.

The use of charts to show interactions is an excellent device, but to interpret
an interaction as the “crossing” of the movements at various levels
[...] of interaction is non-similarity of
[...] of the interaction graphs.
The computation of a row sum of squares (p. 394) when there are actually
not rows but merely an equal number of cases in each class is very confusing.
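
The concern raised above about testing the same data repeatedly until something proves significant can be given a rough numerical form. The sketch assumes k independent tests, each run at a nominal 5 per cent level, with no real effect present. Follow-up t-tests on data already examined by an F-test are not independent, so the figures are a guide to the order of magnitude only.

```python
alpha = 0.05   # nominal significance level of each individual test


def chance_of_false_alarm(k, alpha=alpha):
    """Probability that at least one of k independent tests, each run at
    level alpha, is significant when no real effect exists."""
    return 1 - (1 - alpha) ** k


for k in (1, 2, 5, 10):
    print(k, round(chance_of_false_alarm(k), 3))
```

Under the independence assumption, ten looks at the data give roughly a 40 per cent chance of at least one nominally significant result, far above the 5 per cent the reader is led to expect.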




These criticisms should not deter an instructor well trained in statistics 
from using the text for a beginning course for, as noted above, it will probably 
be liked by the students and this is important. A greater drawback in the 
eyes of the reviewer is the little attention given in the book to confidence 
intervals, errors of the second kind, and correlation.


Mathematics and Statistics for Economists. Gerhard Tintner. New York: Rine- 
hart and Company, Inc., 1953. Pp. xiv, 363. $6.50. 


G. Baley Price, University of Kansas


The standard mathematics curriculum in American colleges and universities
is one which has grown up in connection with the physical sciences—
one which has been designed to support the study of chemistry, physics, 
and the engineering sciences. The traditional sequence of courses—college 
algebra, trigonometry, analytic geometry, calculus, and differential equa- 
tions—is badly out of date because it has undergone no fundamental revision 
in fifty years, and perhaps a hundred. 

There is abundant evidence that major changes are in progress. The 
Mathematical Association of America and the National Council of Teachers 
of Mathematics sponsored a Conference on Teacher Training at the Uni- 
versity of Wisconsin in the summer of 1952. As an outgrowth of this confer- 
ence and of the activities of the National Research Council’s Committee on 
the Regional Development of Mathematics, two committees have been ap- 
pointed to study the revision of the undergraduate curriculum. One is a joint 
committee of the MAA and the NCTM under the chairmanship of Dr. C. V.
Newsom, and the other is a committee of the MAA under the chairmanship 
of Professor W. L. Duren, Jr. The National Science Foundation sponsored 
a Summer Conference on Collegiate Mathematics at the University of Colo- 
rado in the summer of 1953, and it will sponsor two similar conferences, at 
the University of Oregon and the University of North Carolina, in the summer
of 1954. The National Science Foundation will also sponsor a conference
for high school teachers of mathematics in the summer of 1954. All of these
activities are designed to modernize the mathematics curriculum and its
teachers.

Professor Tintner’s book on Mathematics and Statistics for Economists 
must be considered further evidence of the change that is taking place in 
mathematics, especially in its relation to the social and biological sciences. 
This book was written to teach mathematics and statistics to economists,
especially to future econometricians. It was written for students who know 
economics, and who have some knowledge of algebra and trigonometry. 

The book is divided into three parts. Part I covers pages 3 to 65. The chapter
headings in this part are: functions and graphs; linear equations in one
unknown; systems of linear equations; quadratic equations in one unknown;
logarithms; progressions; determinants; and linear difference equations
with constant coefficients. It might be supposed from these chapter headings 




that Part I is a brief college algebra, but this is not the case: it is entitled 
Some Applications of Elementary Mathematics to Economics. In the first 
place, the treatment of algebra is far less extensive than the chapter titles 
would suggest. The chapter on the quadratic equation in one unknown contains
a half page of discussion, in which the formula for the roots of the quadratic
is stated, and a half page of exercises. The chapter on logarithms contains
no treatment of the properties of logarithms and their applications to
computation. Instead, a knowledge of logarithms is assumed, and applica- 
tions are made to two problems in economics. The chapter on determinants 
is marked as one that is not needed for the remainder of the book and can be 
omitted. On the other hand, a chapter on linear difference equations is in- 
cluded, and it is not marked as one that can be omitted. Thus the algebra 
content of Part I is far less than that of the typical course in college algebra. 
In the second place, Part I contains a large amount of economics. Indeed, 
it contains a treatment of the following topics from economics: linear programming;
linear supply functions; linear demand functions; market equilibrium;
market equilibrium for several commodities; imputation; quadratic
demand and supply curves; Pareto distribution of income; demand curves 
with constant elasticity; growth of enterprise; population theory of Malthus; 
and compound interest. 

Part II, entitled Calculus, covers pages 69 to 190 and contains a fairly ex- 
tensive treatment of differential calculus. One chapter of 13 pages is devoted 
to a treatment of integral calculus. This chapter states the fundamental 
theorem of calculus, but it does not suggest a proof. The extent of the treatment
of calculus is well indicated by the chapter headings in Part II: functions,
limits, and derivatives; rules of differentiation; derivatives of logarithmic
and exponential functions; economic applications of the derivative;
additional applications of derivatives; higher derivatives; maxima and mini- 
ma in one variable, inflection points; derivatives of functions of several 
variables; homogeneity; higher partial derivatives and applications; ele- 
ments of integration. It may be remarked that the treatment of calculus 
given here is far less extensive than that contained in the standard calculus
courses normally given in the freshman and sophomore years to physical 
science students. Part II also treats the following topics in economics: demand
functions and total revenue functions; total and average-cost functions;
marginal cost; marginal revenue; elasticity; elasticity of demand;
marginal revenue and elasticity of demand; increasing and decreasing mar- 
ginal costs; monopoly; average and marginal cost; marginal productivity; 
partial elasticities of demand; joint production; utility theory; production 
under free competition; marginal cost, total cost, average cost; and consumer’s
surplus.

Part III, entitled Probability and Statistics, fills pages 193 to 309 and contains
a brief but significant treatment of probability and statistics. The chapter
headings are the following: probability; random variables; moments;




binomial and normal distributions; elements of sampling; tests of hypotheses; 
fitting of distributions; regression and correlation; index numbers; and a 
postscript which contains suggestions for further reading. In contrast with 
the first two parts of the book, economic theory is conspicuous by its absence 
from Part III. To be sure, the fitting of demand and supply curves is treated 
on pages 292 to 297, but in general this part of the book appears to be a 
rather straightforward treatment of statistics. 

Pages 311 to 340 contain answers to the odd-numbered exercises. The next 
section of the book consists of six tables as follows: four place common 
logarithms (pp. 341-342); natural trigonometric functions (pp. 343-346);
four place natural logarithms (pp. 347-348); area of the normal probability
curve (p. 349); Student’s t-distribution (p. 350); and the $\chi^2$ probability scale
(p. 351). The book ends with an index of names (p. 355); an index of mathematical
and statistical terms (pp. 357-360); and an index of economic terms
(pp. 361-363). 

Many departments of mathematics now recognize a responsibility to teach
mathematics for the social and biological sciences as well as for the physical 
sciences. Professor Tintner’s book emphasizes a number of the problems that 
face these departments in their efforts to discharge their new responsibilities. 
The typical instructor in mathematics will feel that he is not competent to 
teach Mathematics and Statistics for Economists because his knowledge of 
economics and statistics is inadequate. A major problem in the introduction 
of a new undergraduate curriculum will be the training of the present 
mathematics staffs to teach the new curriculum. The fact that Professor
Tintner holds the unusual title of Professor of Economics, Mathematics, and
Statistics emphasizes that he is not a typical staff member. A major problem 
concerns the organization and arrangement of a curriculum designed to serve 
the needs of all those fields which now make significant use of mathematics. 
Must departments of mathematics now offer separate courses for physicists 
and chemists, for engineers, for economists, for psychologists, for biologists, 
and so on? The small liberal arts colleges, in which so many of our scientists 
and other scholars originate, will find such an arrangement impossible because
of their limited staffs and the small number of their students. Many
university educators and administrators will oppose such specialized courses 
on a variety of grounds. But is it possible to devise a freshman course in 
mathematics that the entire university will find adequate and acceptable? 
The question is more easily asked than answered. Finally, Professor Tintner’s 
book emphasizes that the needs of the economists will not be satisfied by a 
course in statistics. It appears significant that, as pointed out above, the 
concepts and principles of economics occur in Parts I and II rather than in 
connection with statistics in Part III. 

Whatever the ultimate solution of the problems involved in teaching
mathematics to economists, both the mathematicians and the economists are
indebted to Professor Tintner. He has written what appears to be a teachable




textbook in a new field. In particular, he has provided an enormous collection 
of significant and vital exercises in economics which involve the elementary
parts of mathematics. The extent of this collection of exercises is indicated 
by the fact that, as noted above, the answers to the odd-numbered exercises 
alone fill 30 pages of the book. One of the major problems that confronts the 
mathematicians in their efforts to revise their elementary curriculum is the 
collection of similar exercise material which will relate the old and new 
mathematics to the various fields and subjects where it finds application. 


Hood, William C., and Koopmans, Tjalling C., editors. Studies in Econometric 
Method. Cowles Commission Monograph No. 14. New York: John Wiley and
Sons, 1953. Pp. xix, 323. $5.50.


Kenneth J. Arrow, Stanford University


This collection of ten studies of problems in the estimation of simultaneous
structural equations is indispensable for the modern econometrician. It 
gives a virtually complete picture of the present state of the subject and at 
the same time is eminently readable. Of the papers, three have been pre- 
viously published, the remainder being specially written for this volume. 

The present volume follows a pattern already set in earlier collective works 
published by the Cowles Commission; there is one long paper which sets forth 
systematically the basic ideas of the subject, while the remaining papers pre- 
sent more detailed expositions or further developments. In this case, the 
central paper is Chapter VI, by Koopmans and Hood, an admirably clear
exposition of the model of linear simultaneous stochastic difference equa- 
tions, the definition of exogenous, endogenous, and predetermined variables, 
the criteria for identification in this model, the maximum-likelihood method 
of estimation with particular reference to the limited-information case, and 
some statistical tests of the validity and the identifying power of a priori
restrictions. The derivations are new and very much simplified from earlier 
versions, though perhaps one would hardly call them simple in an absolute 
sense. Though the article is not written in textbook form, it will form essential
supplementary reading to a book, such as Klein's, if it is desired to supply
the student with a derivation of the basic estimation formulas. 

The first paper, by Jacob Marschak, is an excellent exposition of the role
of statistical inference in economic policy and prediction. The concepts 
studied by Koopmans and Hood are here introduced in relation to the uses 
for which they are intended. The second paper, by Koopmans, is a non- 
technical exposition of the concept and problems of identification; it is reproduced,
with minor revisions, from a paper published in Econometrica.
The careful discussion of various examples will be invaluable pedagogically. 
In the third paper, Herbert A. Simon seeks, on the basis of the concepts of
exogenous and endogenous variables, to define the notion of causality in a


1 L. R. Klein, A Textbook of Econometrics, Evanston and White Plains: Row, Peterson, and Company, 1953.




way which will meet positivist objections, such as those of Hume. He argues 
that the complete rejection of the concept of causality (as opposed to func- 
tional interrelationship), as in Russell's position (to which, however, Simon 
does not refer), does not correspond to the intuitive practice of scientists. 
An interesting discussion is then given of causal ordering of variables in a 
linear structure; the concept is closely related to that of identifiability. 

The fourth and fifth papers, by Trygve Haavelmo, and by M. A. Girshick 
and Haavelmo, respectively, are now famous empirical applications of the 
simultaneous equations method to the consumption function and to the 
demand for food. 

The seventh paper, by Herman Chernoff and Herman Rubin, shows, 
without proofs, how limited-information estimates may be used even when 
the conditions under which they were derived are not valid, in the sense that 
they give rise to consistent estimates. From the point of view of new knowl- 
edge, this is undoubtedly the most important paper of the volume. It is 
remarked that if predetermined variables in the system but not in the group 
of equations to be estimated are omitted, the estimates resulting will still be 
consistent if the variables omitted are not needed for identification. It is also 
shown that in many cases errors in the variables and non-linearities in the 
equations can be accommodated. 

In the eighth paper, Stephen G. Allen studies, in an example, the loss of 
efficiency by omitting a predetermined variable in a particular equation to 
be estimated. In the ninth paper, Jean Bronfenbrenner (Crockett) examines 
the bias attributable to the use of the method of least squares in a two-equation
model. Both papers are very illuminating in giving a more intuitive
appreciation of the sense in which the simultaneous equations method is
optimal.
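
The kind of least-squares bias discussed in the ninth paper can be seen in a small simulation. The model below is an assumed textbook-style two-equation system (a consumption function and an income identity with exogenous investment), not the model analyzed in the volume, and all parameter values are invented. The sketch compares the least-squares slope with an estimate that uses the exogenous variable, in the spirit of the simultaneous equations approach.

```python
import random

random.seed(1)               # fixed seed so the sketch is reproducible
alpha, beta = 10.0, 0.6      # hypothetical structural parameters
T = 20000                    # number of simulated observations


def cov(x, y):
    """Sample covariance (divisor T) of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


# Structural model (assumed): consumption = alpha + beta * income + shock,
#                             income = consumption + invest.
invest = [random.gauss(20.0, 1.0) for _ in range(T)]   # exogenous variable
shock = [random.gauss(0.0, 1.0) for _ in range(T)]     # disturbance
# Solving the two equations for income gives the reduced form used here.
income = [(alpha + i + u) / (1 - beta) for i, u in zip(invest, shock)]
consumption = [y - i for y, i in zip(income, invest)]

# Least squares of consumption on income: biased, since income depends on shock.
ols_slope = cov(consumption, income) / cov(income, income)
# Estimate using the exogenous variable as an instrument.
iv_slope = cov(consumption, invest) / cov(income, invest)

print(round(ols_slope, 3), round(iv_slope, 3))
```

With equal variances for investment and the disturbance, the least-squares slope tends toward beta + (1 - beta)/2 = 0.8 rather than the assumed 0.6, while the estimate based on the exogenous variable stays close to 0.6.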

The last paper, by Herman Chernoff and Nathan Divinsky, is an extremely 
complete exposition of the computational methods used in various types of 
maximum-likelihood estimates. The practicing econometrician will make ex- 
tensive use of this section. 


This is a very useful collection of papers, which I can strongly recommend.


Stochastic Processes. J. L. Doob. New York: John Wiley and Sons, 1953. Pp. 
vii, 654. $10.00. 


P. A. P. Moran, Australian National University, Canberra


The average statistician or worker in applied probability theory never has
to deal with more than a finite number of random variables and thus 
never really needs to know much about measure theory. But the mathema- 
tician who wishes to found probability theory on a rigorous basis needs more 
elaborate theory. The strong law of large numbers (which is essentially an 
empirically unverifiable theorem) necessarily involves a theory of measure
in a space with an enumerable infinity of dimensions while the theory of


662 AMERICAN STATISTICAL ASSOCIATION JOURNAL, SEPTEMBER 1954


continuous random processes requires, for a fully rigorous foundation, a deep
discussion of measure theory. In the present very remarkable book Professor 
Doob sets out a fully rigorous foundation for the theory of random processes 
both with discrete and with continuous parameters and in the course of doing 
this discusses a number of theorems of interest to those working on other 
parts of pure probability theory. The result is the most complete discussion 
yet published in book form of the foundations of the theory of processes. 

The book opens with a chapter on probability theory which is openly and
frankly (and in the opinion of the reviewer, rightly) equated with the theory 
of measure. The elementary ideas of random variables, families of variables, 
and modes of convergences for random variables are very carefully intro- 
duced and then follows the most important part of this chapter—a discussion 
of conditional probabilities. This is usually skirted around in textbooks since 
a rigorous treatment requires considerable care, and was first given by 
Kolmogoroff. Then, after some standard results on characteristic functions 
there are some very interesting new inequalities between the tails of a dis- 
tribution and integrals involving its characteristic function. 

In the second chapter the author introduces the idea of stochastic process 
and considers the classification of such processes in general. For a process in 
which “time” (here always represented by a variable on a linear set) is con- 
tinuous we immediately get into difficulties in setting up a probability meas- 
ure in the class of all realizations of such a process. These are overcome by 
insisting that the processes be “separable” in a certain sense introduced by 
the author. This is probably the most difficult part of the book to the average 
reader and requires a knowledge of measure theory beyond that of most 
analysts. A useful supplement at the end of the book attempts to bridge the 
gap. Next, the author introduces Gaussian processes and gives a discussion 
of the Markovian property which is rather more careful than is usual. 

The next chapter on processes with mutually independent variables is 
concerned with classical results in the theory of probability, most of which 
will be more or less familiar to the reader. These are not discussed for their
own sake but for the light they throw on random processes. The Borel-Cantelli
lemma and similar results on series of variates and the law of large
numbers are discussed with great care. Next we have a welcome account of
infinitely divisible distributions and Lévy’s general formula for their
characteristic functions. The arithmetic of distributions is not, however,
considered further than the needs of the theory of processes require. A general
account of this subject in English is much to be desired but remains to be 
written. 

Chapter IV considers processes with mutually uncorrelated but not 
necessarily independent variables. This is thus more general and the interest 
lies in seeing how far we can get with the weaker assumption. 

Chapter V, on Markov chains with a discrete parameter, deals with rela-
tively familiar theory. The usual theory of a finite Markov chain with


BOOK REVIEWS 663


stationary transition probabilities, and the classification of the states, is given
together with a short application to card mixing. Multiple Markov chains 
in which the transition probabilities depend not merely on the previous 
states but also on earlier states are reduced in the obvious way to ordinary 
Markov chains. This, however, is an illustration of the way in which the 
author deals only with the general theory of the subject (difficult as that is) 
without dealing with the analytical difficulties which arise as soon as we 
specialize the theory to some particular problem. Complex Markov chains 
(as Markov originally called them) are very awkward to deal with in practice.
Next we have a long and rather difficult account of the generalization
(mainly due to Doeblin) of the previous theory to general state spaces and
the corresponding law of large numbers and a central limit theorem.
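
The reduction of a multiple Markov chain to an ordinary one can be sketched in a few lines. This is only an illustration and is not taken from the book: a second-order chain on the states 0 and 1 is rewritten as a first-order chain whose states are the pairs (previous, current). The probabilities in `second_order` and the helper name `lift_to_first_order` are hypothetical.

```python
from itertools import product

# Hypothetical second-order chain on states {0, 1}:
# second_order[(i, j)][k] = P(next = k | previous = i, current = j).
second_order = {
    (0, 0): [0.9, 0.1],
    (0, 1): [0.4, 0.6],
    (1, 0): [0.7, 0.3],
    (1, 1): [0.2, 0.8],
}

def lift_to_first_order(p2, n_states):
    """Return (pairs, matrix) for the equivalent first-order chain on pairs."""
    pairs = list(product(range(n_states), repeat=2))
    index = {pair: row for row, pair in enumerate(pairs)}
    matrix = [[0.0] * len(pairs) for _ in pairs]
    for (i, j) in pairs:
        for k in range(n_states):
            # The pair (i, j) can only move to a pair whose first entry is j.
            matrix[index[(i, j)]][index[(j, k)]] = p2[(i, j)][k]
    return pairs, matrix

pairs, matrix = lift_to_first_order(second_order, 2)
for pair, row in zip(pairs, matrix):
    print(pair, row)
```

The enlarged chain has the square of the original number of states and a transition matrix that is mostly zeros, which gives some idea of why such chains become awkward in practice.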

Next we have a chapter on Markov processes with a continuous parameter,
firstly those with a finite number of states with the usual theory of forward 
and backward differential equations and then to a continuous state space 
and chains with an enumerably infinite set of states, the latter being nowa- 
days of very great importance in such subjects as the theory of queueing. 
The Fokker-Planck and related equations are then considered at some length. 

Chapter VII is a long (100 pages) and interesting chapter on what are now
known as martingales. This word, which is due in this connection to J. Ville, 
is usually used in connection with the harness of horses and the rigging of 
ships but is also used, for some odd reason, for a gambling system in which 
the stake is increased by a factor of two at each trial. In probability theory 
a martingale is defined as a random process $\{x_t\}$ such that $E\{|x_t|\} < \infty$ and

$$x_{t_n} = E\{x_{t_{n+1}} \mid x_{t_1}, \ldots, x_{t_n}\}$$

with unit probability, whenever $t_1 < \cdots < t_{n+1}$ and $n$ is an arbitrary positive
integer. This watery looking object does not look at first sight as if it would 
have a very interesting theory, but the theorems and results which follow are 
most interesting and varied and introduce a new unity into a widely scattered
series of problems. Much of the work described is due to the author.
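
As a small illustration of the martingale idea (a sketch, not taken from the book), the doubling system mentioned above can be checked by exact enumeration. The assumptions are mine: a fair coin, an initial stake of 1, the stake doubled after each loss, and play stopped at the first win or after a fixed number of rounds. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def doubling_fortune(outcomes):
    """Net winnings from the doubling system on a sequence of win/lose outcomes.

    The stake starts at 1, doubles after every loss, and play stops at the
    first win (or when the sequence of outcomes is exhausted).
    """
    fortune, stake = 0, 1
    for win in outcomes:
        if win:
            fortune += stake
            break
        fortune -= stake
        stake *= 2
    return fortune

def expected_fortune(n_rounds):
    """Exact expectation over all 2**n_rounds equally likely coin sequences."""
    weight = Fraction(1, 2) ** n_rounds
    return sum(weight * doubling_fortune(path)
               for path in product([True, False], repeat=n_rounds))

for n in (1, 3, 8):
    print(n, expected_fortune(n))
```

With a bounded number of rounds the rare run of straight losses exactly offsets the frequent small gain, so the expected fortune stays at its starting value, which is what a "fair" game requires.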

He first considers applications to games of chance where the idea of 
martingale is used to provide a definition of a “fair” game. This leads to a 
new and interesting discussion of the effect on fair games of systems of op- 
tional stopping and sampling. Next follow some new inequalities for expec- 
tations in martingales and a series of convergence theorems. The general 
theory is then applied to sums of independent variables, to the strong law of 
large numbers, the theory of derivatives, and the relationship to the likeli- 
hood ratio and sequential analysis pointed out. The corresponding theory of 
martingales with a continuous parameter is then developed at some length 
together with some remarks on the application to Poisson and Brownian
processes.

In the eighth chapter processes with independent increments, which form
a logical prelude to the study of stationary processes, are considered.
Examples of these are provided by the elementary Poisson process and the
Brownian movement. The centering of the general process of this type and 
the classical theory of the character of the distribution function (due to its 
infinite divisibility) are discussed. Cramér's classical theorem on the sum of 
two independent random variables (for which an ‘elementary’ proof is 
much to be desired) is mentioned but not proved or used in this treatment. 

In the following chapter the previous theory is generalized to processes 
whose increments are only orthogonal instead of independent, and this is 
combined with a detailed discussion of stochastic integrals. An interesting 
application mentioned is that to Campbell's theorem. However, the applica- 
tion of this theorem to electrical noise is not developed and Campbell's name 
is not to be found in the bibliography. An interpretation of the idea of a 
Fourier transform of an actual realization of a process is followed by a gen- 
eralization and further discussion of stochastic integrals. 

Next, in chapter X, we come down to stationary processes with a discrete 
parameter which are prefaced by a detailed discussion of measure preserving 
transforms. The strong law of large numbers and the Wold-Khintchine
theorem are proved and illustrated and this is followed by a discussion of 
the effect of linear operators on the spectrum of a process and of processes 
with rational spectra. In the following chapters all these ideas are discussed 
for continuous processes. 

The final chapter is a rigorous discussion of linear least squares prediction 
for stationary processes, a subject which is of great interest in practice. The 
practical applications are, however, not discussed. It is natural to confine 
the discussion to linear least squares prediction but it is worth pointing out 
that there is a need for more research on cases where non-linear prediction is
better. Consider, for example, a discrete process $\{x_n\}$ where each $x_n$ is
distributed uniformly on $(-1, 1)$ and $x_{n+1} = 1 - 2|x_n|$. Then all the serial
correlations are zero but an exact predictor always exists. There is also a need
for research into processes which are generated by nonlinear difference equations
and processes which are not symmetric about their means.
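
The reviewer's example can be checked numerically. The sketch below is my own illustration, not part of the review: it integrates over the uniform density on (-1, 1) with a midpoint rule rather than iterating the recursion for a long time, because repeated floating-point iteration of this particular map loses precision quickly. The function names and grid size are arbitrary choices.

```python
def tent(x):
    """The deterministic rule x_{n+1} = 1 - 2|x_n| from the example."""
    return 1.0 - 2.0 * abs(x)

def lagged_moments(lag, n_grid=20_000):
    """Midpoint-rule estimates of E[x_{n+lag}] and E[x_n * x_{n+lag}]
    when x_n is uniform on (-1, 1), i.e. has density 1/2."""
    h = 2.0 / n_grid
    mean_future = 0.0
    cross = 0.0
    for i in range(n_grid):
        x = -1.0 + (i + 0.5) * h
        y = x
        for _ in range(lag):
            y = tent(y)
        mean_future += y * h / 2.0
        cross += x * y * h / 2.0
    return mean_future, cross

for lag in (1, 2, 3):
    mean_future, cross = lagged_moments(lag)
    # E[x_n] = 0, so the serial covariance at this lag equals `cross`.
    print(lag, abs(mean_future) < 1e-9, abs(cross) < 1e-9)
```

Since the serial covariances vanish, the best linear predictor of the next value is simply the mean, 0, with mean squared error equal to the variance 1/3, whereas the non-linear rule `tent(x_n)` predicts the next value without error.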

The author is much to be congratulated on this very important book. 
His aim of giving a rigorous foundation to the subject is, no doubt, mainly
responsible for there being no space for a discussion of the problem of
inferring the structure of a process from a sample realization, and very little
discussion of particular processes. As a result much work on processes is
never referred to, and the names of Yule, Bartlett, Pitt, Quenouille and the
Kendalls never appear, a fact for which the author rightly apologizes in his
preface. The printing, binding, and textual accuracy are of the very highest
order.




Introduction to the Theory of Stochastic Processes Depending on a Continuous 
Parameter. Henry B. Mann. National Bureau of Standards, Applied Mathe- 
matics Series, 24. Washington: U. S. Government Printing Office, 1953. Pp. v, 
45. $0.30. 


Ulf Grenander, University of Stockholm


This booklet should be useful for a reader who wishes to find out quickly
and not in too great detail what sort of questions are dealt with in the
theory of stochastic processes. Its 45 pages contain a good deal of information 
on this subject. 

After a discussion of some basic concepts, the author defines a stochastic
process as a one-parameter family of stochastic variables and studies various
linear operations such as differentiation and integration. Some special proc- 
esses are discussed in Chapters 2 and 4, mainly of independent increments, 
and related statistical problems are dealt with in Chapter 3. In Chapter 5 
counter data are considered as forming a stochastic process and Chapter 6 
finally is devoted to harmonic analysis of processes and the mean ergodic 
theorem. 

The clarity of the exposition and the simplicity of the mathematical
machinery that is used make the book easy to read. The reader who wants
to pursue the topic further can find a more complete treatment in two 
recently published books, J. L. Doob’s Stochastic Processes (Wiley 1953), and 
A. Blanc-Lapierre and R. Fortet’s Théorie des fonctions aléatoires (Masson
et Cie, 1953). 




Small Particle Statistics. Gustav Herdan. Amsterdam: Elsevier Publishing Com- 
pany, 1953. Pp. xxiii, 520. $12.00.


Benjamin Epstein, Wayne University


In modern technology, particles ranging in size roughly from 10 down to
$10^{-5}$ centimeters play a very important role. For example, the strength
of glassware, ceramic ware, or cement depends to a large degree on the fineness
and size distribution of the raw material being used. The resistance of dyes
and paints to weathering and many other physical properties are strongly
affected by the size distribution of the raw materials used and by the way in
which they are dispersed throughout the dye or paint. The health of workers
in a factory is affected by the kind, density, and distribution of pollutants
(generally fine particles) in the atmosphere. In nature, microscopic soil
properties are of great importance in sedimentary petrography, in agriculture,
and in soil physics. The suitability of coke as a blast furnace fuel can be
predicted to some extent from the kinds of size distributions obtained when a
sample of coke is broken up into small pieces by the application of various
breakage processes. It is virtually impossible to study such properties as
“grindability,” “resistance to impact,” “resistance to abrasion,” and the like,
without dealing with particle size distributions and how they change, for
example, with time or energy expended.




The author gives an exhaustive account of the current state of knowledge 
in this field. Since the particles being measured are generally quite small this 
raises many problems. For example, the problems of how to prepare a sample
for measurement, how to carry out the measurements, and what to measure
are quite involved technically and important statistically because of the
way in which they can affect the data with which the statistician will be
asked to work. The author treats this aspect of the subject admirably, going
into such things as sieving, sedimentation, microscopic, and adsorption
methods, etc. He also considers various ways of recording data whether by size, by
surface area, by weight, etc., and indicates the physical reasons why one
measure might be preferable to another depending on circumstances. The
reviewer can say, from his own experience, that the statistician called upon 
to give advice in this field would do well to be aware of such technical 
considerations. 

Roughly half of the book is devoted to a treatment of elementary statistics 
and the elements of experimental design. This was done by the author in 
order to make the book self-contained. In the opinion of the reviewer, it 
would have been better to eliminate most of this material and advise the 
reader to become acquainted with a basic applied statistics book such as that 
recently written by A. Hald. Specifically, a good deal of the material in 
Chapters 2, 4, 6, 7, 8, 10 could have been omitted or treated with greater 
brevity. By and large the author's treatment of statistics is sound. The 
author does, however, seem to slip up in the examples on p. 160 and p. 162, 
since one should surely separate out the effect of variation among the 
laboratories in running the tests of significance. Analysis of variance is called 
for and not a simple t-test. 

The statistician will find parts of this book very interesting. Among the 
specially noteworthy features are: (1) critical discussion of how to draw a 
sample, what and how to measure, kinds of errors likely to appear, etc.; (2) 
discussion of the distribution laws arising in particle statistics (Chapter 6); 
(3) discussion of the mechanism of crushing and grinding and associated 
statistical questions (Chapter 13); (4) statistics applied to problems of mixing 
(Chapter 14); (5) consideration of the statistics of polymerized materials 
(Chapter 15); (6) sampling procedure in sedimentary petrology (pp.
417-429). Other good features of the book are the many illustrations,
excellent figures, and extensive bibliography.

Dr. Herdan wrote Chapters 1-17. Chapters 18-23 were written by Dr.
M. L. Smith. These latter chapters are devoted to a critical discussion of
various experimental methods for determining both size distribution and 
surface area. This is a complicated problem in the subsieve range and re- 
quires very delicate experimental procedures. 

To sum up, the book is a “must” for all who work in the statistics of fine
particles. It should give the statistician a good deal of insight into the
problems peculiar to this field. It should give the technologists and scientists an
appreciation of what statistical methods can accomplish in this area. The
book should certainly stimulate healthy cooperation. 


U. S. Army, Ordnance Corps: Tables of the Cumulative Binomial Probabilities. 
Ordnance Corps Pamphlet ORDP 20-1, September 1952. Pp. viii + 577. 9×12
inches. For sale by Office of Technical Services, Department of Commerce, 
Washington 25, D. C. at $6.00 per copy. Orders should cite Order No. PB 111389. 
This is by far the most extensive table of the binomial distribution published
yet. Cumulative probabilities are available to seven decimals for popu-
lation proportions from 0 to 1 by steps of 0.01, for sample sizes through 150
by steps of 1. An Introduction gives a good explanation of the use of the
tables, and explains their relations to the Incomplete Beta Function Ratio.
These tables will be indispensable to all practicing statisticians concerned
with the binomial distribution.
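
The relation between cumulative binomial probabilities and the Incomplete Beta Function Ratio can be illustrated with a short sketch. It relies on the standard identity P(X >= c) = I_p(c, n - c + 1) for X binomial with parameters n and p. The notice does not say which tail the pamphlet tabulates, so the upper tail is used here only because it matches the identity directly; the function names and the values n = 10, c = 3, p = 0.30 are illustrative choices of mine.

```python
from math import comb, factorial

def upper_tail(n, c, p):
    """P(X >= c) for X ~ Binomial(n, p), by direct summation."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(c, n + 1))

def incomplete_beta_ratio(x, a, b, steps=2000):
    """I_x(a, b) for positive integers a, b, using Simpson's rule
    on the integrand t^(a-1) (1-t)^(b-1) over [0, x]. `steps` must be even."""
    norm = factorial(a + b - 1) / (factorial(a - 1) * factorial(b - 1))
    h = x / steps

    def integrand(t):
        return t**(a - 1) * (1 - t)**(b - 1)

    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return norm * total * h / 3

n, c, p = 10, 3, 0.30
print(f"{upper_tail(n, c, p):.7f}")
print(f"{incomplete_beta_ratio(p, c, n - c + 1):.7f}")
```

Both lines should agree to the seven decimals carried by the tables; the direct sum is the more practical route for small n, while the beta form is the one the older tables of the Incomplete Beta Function were built around.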
W.A.W. 


The Theory of Inventory Management. Thomson M. Whitin. Princeton: Prince-
ton University Press, 1953. Pp. viii, 245. $4.50. 


Robert Dorfman, University of California, Berkeley


Inventories and their management go back to Joseph, advisor to Pharaoh, 
at least. The theory of inventory management has a much shorter history, 
however. Up to the 1920’s, inventory policy seems to have been based largely 
on rule-of-thumb. Some of the simpler problems were formulated and solved
during the 1920’s, but the development of a systematic theory as a branch 
of economic and managerial science was undertaken only after World War 
II, as an aspect of the operations research movement. 

Whitin’s monograph is an introduction to this promising and fast-growing 
field. It is divided into three parts. In Part I, Whitin develops the principles 
of inventory management in the individual firm and compares his conclusions
with those of earlier writers, particularly Eiteman and Boulding. Part II
deals with the effects of inventory policies on fluctuations in economic activ-
ity and the implications of inventory theory for theories of general economic
equilibrium. In this connection the theories of Keynes, Metzler, and Leon- 
tief are examined in the light of the results of Part I and of empirical data. 
Part III treats the inventory problems of the national military establish- 
ment and applies some of the principles of Part I to them. 

The characteristics of an optimal inventory policy for an individual firm,
dealt with in Part I, are the foundation of the entire treatment. There are, 
essentially, two issues to be considered in formulating an inventory policy. 
The first of these concerns cost minimization: the costs of carrying large 
inventories have to be balanced against the costs of reordering supplies at 
frequent intervals. If the cost functions are simple enough, the size of order
which minimizes the sum of the ordering cost and the carrying cost per unit
(this is the economic purchase quantity) can be determined by the differ-
ential calculus. Whitin shows that this leads immediately to two interesting 
consequences: first, the economic purchase quantity and therefore the 
average size of inventories varies in proportion to the square root of the vol- 
ume of sales; second, the superficial dictum that “the higher the turnover 
rate the better” is misleading because it may lead to excessively high reorder- 
ing costs. 
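
The square-root result can be illustrated with the textbook lot-size model. This is a sketch under assumptions that the review does not spell out: a fixed cost K per order, a carrying cost h per unit per period applied to an average inventory of q/2, and steady sales S per period, so that the cost per period is K S / q + h q / 2 and setting its derivative to zero gives q* = sqrt(2 K S / h). The symbols, function names, and numbers are illustrative and are not Whitin's.

```python
from math import sqrt

def total_cost(q, sales, order_cost, carrying_cost):
    """Ordering cost plus carrying cost per period for order size q."""
    return order_cost * sales / q + carrying_cost * q / 2.0

def economic_purchase_quantity(sales, order_cost, carrying_cost):
    """Order size at which d(total_cost)/dq = 0: q* = sqrt(2 K S / h)."""
    return sqrt(2.0 * order_cost * sales / carrying_cost)

K, h = 50.0, 2.0  # hypothetical cost figures
for sales in (1_000, 4_000, 16_000):
    q = economic_purchase_quantity(sales, K, h)
    turnover = sales / (q / 2.0)  # sales divided by average inventory
    print(sales, round(q, 1), round(turnover, 1),
          round(total_cost(q, sales, K, h), 1))
```

In this model quadrupling sales only doubles the optimal order size, and ordering in lots smaller than q* raises the turnover rate while also raising total cost, which is the sense in which "the higher the turnover rate the better" misleads.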

The second main issue concerns risk minimization: the risks of overstock-
ing associated with large inventories have to be balanced against the risks of
depletion associated with small inventories and against the disorganization 
of production and sacrifice of sales and goodwill which depletion entails. 
This is a far more complicated problem than cost minimization. The major 
aspects of this problem are: (1) the measurement of the losses which would 
result from overstocking if it occurs and from depletion if that occurs, i.e., 
the determination of the “loss function” in Abraham Wald’s terminology; 
(2) the estimation of the probability distribution of withdrawals from inven- 
tory, which determines the probability of the various possible losses associ- 
ated with a given inventory policy; (3) the establishment of a policy cri-
terion, be it minimum-maximum loss, minimum expected loss, or whatever;
(4) the consolidation of these three aspects into a decision procedure which 
determines optimal inventory policy as a function of observable data. 
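
The four aspects listed above can be sketched on a toy problem. Everything numerical here is hypothetical: a discrete distribution of withdrawals, a unit cost for overstocking and for depletion, and two alternative criteria (minimum expected loss and minimax loss). It illustrates the decision-theoretic outline only and is not a procedure from Whitin's book.

```python
# (1) Loss function: hypothetical unit costs of overstocking and of depletion.
OVERSTOCK_COST = 1.0
DEPLETION_COST = 3.0

def loss(stock, demand):
    """Loss incurred when `stock` units are held and `demand` are withdrawn."""
    if stock >= demand:
        return OVERSTOCK_COST * (stock - demand)
    return DEPLETION_COST * (demand - stock)

# (2) Hypothetical probability distribution of withdrawals from inventory.
demand_dist = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.2, 4: 0.1, 5: 0.1}

# (3) Two possible policy criteria.
def expected_loss(stock):
    return sum(prob * loss(stock, demand) for demand, prob in demand_dist.items())

def maximum_loss(stock):
    return max(loss(stock, demand) for demand in demand_dist)

# (4) Decision procedure: choose the stock level that is best under each criterion.
candidates = range(0, 6)
best_expected = min(candidates, key=expected_loss)
best_minimax = min(candidates, key=maximum_loss)
print("minimum expected loss:", best_expected)
print("minimax loss:", best_minimax)
```

With these made-up figures the two criteria select different stock levels, which is why the choice of criterion is listed as a separate step rather than taken for granted.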

Whitin’s handling of the risk problem is different in content and spirit 
from the quadri-partite analysis sketched above. There is little or no dis- 
cussion of the loss function or of the problems of ascertaining it, and the 
worked-out illustrations depend on the assumption that the losses resulting 
from depletion are a known dollar-and-cents unit cost multiplied by the 
amount of the inventory deficiency. It is assumed without argument that 
the objective of inventory policy is to minimize the expected loss, so that
the issue of selecting a policy criterion does not arise. Nor does Whitin bring
up the problem of an integrated decision procedure. Instead he handles
separately the two sub-problems of (a) determining the probability distribu-
tion of withdrawals from inventory and (b) determining the cost-minimizing
policy assuming that the probability distribution is given.

Indeed, Whitin simplifies the problem even further because most of his
discussion concerns the case, relatively infrequent in practice, in which the
probability distribution is known in advance. In his most extended treatment
of a case with an unknown probability distribution, the case of style-goods,
he recommends simply that the distribution be estimated by asking buyers
to forecast the maximum amount of sales they foresee and adjusting these
forecasts in the light of the past performance of the forecasters. (See pp.
[illegible].) What is left after all these simplifications is the problem of minimizing
expected costs and losses, given the probability distribution of withdrawals
from inventory. The problem is further limited in much of the development by
assuming that the acceptable level of risk of running out of stock has been
predetermined. The cost of all these restrictions is suggested by the work of 
Dvoretzky, Kiefer, and Wolfowitz,¹ who have shown that if reordering costs
are appreciable, a policy of a type excluded by Whitin may be optimal.

Whitin’s simplifications have the virtue of rendering the inventory problem
amenable to the methods of the differential calculus, or the marginal
analysis beloved of economists. Even though Whitin’s methods would be 
inadequate in many practical situations, in circumstances in which uncer- 
tainty is unimportant and in which the costs of overstocking and under- 
stocking can be calculated without difficulty they should provide useful 
guidance. Moreover, some of the consequences of the analysis, especially the
tendency of optimal safety margins to increase according to the square root 
of the level of sales, are generally valid and highly suggestive. 

As applied to the study of economic fluctuations and general equilibrium
in Part II, the main consequence of the theoretical analysis is that inven-
tories tend to increase in proportion to the square root of the level of economic
activity. Metzler, Boulding, and Leontief² have all assumed, at one
time or another, that inventories vary in direct proportion to the level of 
activity. Their theories are, therefore, subject to criticism. Whitin also
produces some empirical data which tend to support his position. He concludes,
rather tentatively, that although the square-root law mitigates the de- 
stabilizing effect of inventories on business cycles, inventories probably do 
contribute to cyclic instability. 

Part III discusses the inventory problems of the national military estab-
lishment. The wastefulness of some current rule-of-thumb practices is
pointed out in convincing detail. In this context Whitin raises one of the
fundamental questions which he neglected in his general treatment: the estima-
tion of the cost of running out of stock of some item or, in other words, the
determination of the marginal value of an inventoried item. His proposal
is to apply the methods of game theory, that is, to calculate the value of a
war game with a given stock of the item and compare this with the value
computed with the stock increased by one unit. He nowhere mentions the
use of the closely related methods of linear programming which, also, yield
estimates of the value of inventories in military and private organizations.

This is the first full-length treatment of a new field. The exposition is 
generally clear and the mathematics employed are simple and familiar. Many 
of the results included have already been applied successfully and the treat- 
ment unmasks a number of common fallacies about inventory management 
and behavior. This book can therefore serve usefully as an introduction to
the field. But the reader should be warned that this monograph contains
only a smattering of what is known today about the theory of inventory
management and that what is known today is only a smattering of what we
shall need to know before the theory is ready for wide application.

¹ A. Dvoretzky, J. Kiefer, and J. Wolfowitz, “The Inventory Problem,” Econometrica, 20 (1952),
187-222, 450-66.
² Lloyd A. Metzler, “The Nature and Stability of Inventory Cycles,” Review of Economic Statistics,
August 1941; Kenneth E. Boulding, A Reconstruction of Economics (New York, 1950), Part I; W. W.
Leontief and others, Studies in the Structure of the American Economy (New York, 1953), Chapter 3.


The Role of Mergers in the Growth of Large Firms. J. Fred Weston. Berkeley: 
University of California Press, 1953. Pp. xvi, 159. $3.50. 


See review article by G. Warren Nutter on page 448. 


Studies in Income and Wealth, Volume Fifteen. Conference on Research in
Income and Wealth. New York: National Bureau of Economic Research, 1953.
Pp. x, 230. $3.50. 


H. S. Houthakker, Stanford University


This volume of the well-known series contains eight papers presented in
1950 at a conference in Allerton Park, Illinois. Although the meeting was
intended to deal with the distribution of income by size, only one of the 
papers is strictly concerned with that subject, most of the others being de- 
voted to problems arising in the cross-section analysis of the use of personal 
income. It must be hoped that a future conference will go into the original, 
relatively unexplored area, but the quality of some of the papers collected 
here compensates for the change in emphasis.

The only paper on income size distributions, by George Garvy, is mainly 
expository. Of greater interest is a contribution by D. Gale Johnson, who 
tries to elucidate the low incomes of southern farm families by comparing 
the income of non-farm families in the south and elsewhere. Though ham- 
pered by apparent inconsistencies in the data, he advances some remarkable 
conclusions which are supplemented by comments from other students of 
regional incomes, who present new data. 

In a paper written in 1935, but not hitherto published, Milton Friedman
suggests a new method of ranking families of different composition by their
relative economic status. Following Sydenstricker and King’s pioneering
article in the 1920-21 volume of this Journal, he proposes the estimation of
weights for each class of family members such that, when both income and
a particular item of expenditure are divided by the sum of the relevant
weights, the resulting relation is independent of family composition. As is
recognized by the author and confirmed by calculations in Jean Mann Due’s
comment, this method leads to inconsistent results because the weights de-
pend on the expenditure item considered. It is therefore surprising that
Friedman rejects, on “pragmatic” grounds, a more acceptable definition,
which distinguishes between income weights and specific weights. The latter
approach occurs in R. G. D. Allen’s contribution to the Schultz memorial
volume (Studies in Mathematical Economics and Econometrics, Chicago,
1942) and has been used with some success by authors associated with the
Cambridge Department of Applied Economics (particularly by S. J. Prais
in the Economic Journal of December 1953).

It is perhaps even more surprising that Professor Friedman should have 
been so ready to infer the “economic status” of different families from their
incomes and expenditures only. Since these data by themselves show only 
shifts in demand functions, not effects on satisfaction, any such endeavor 
necessarily involves additional, usually unstated, assumptions, which reflect 
nothing but the preconceptions of the investigator. The objections to inter- 
personal comparisons of utility apply with full force here, especially because 
the presence or absence of children may itself have a strong influence on 
well-being and this influence cannot normally be isolated. 

Much the same point has to be made in connection with Mollie Orshan- 
sky’s attempt to find “equivalent” levels of living for farm and city dwellers. 
Except by asking highly sophisticated questions there is no way of deter- 
mining the income at which city families are as well off as farm families with 
a given income. The particular method followed in this paper, first suggested
by Dorothy S. Brady, is based on the alleged existence of income levels at
which the income elasticity of quantity (as distinct from expenditure)
reaches a maximum. Quite apart from the statistical difficulties in estimating 
them, the mere fact that these income levels, if they exist at all, are not the 
same for all commodities proves that they have no welfare significance 
whatever. Nevertheless Miss Orshansky’s calculations bring out some
interesting features of household expenditure patterns.

Data collected by the Survey Research Center of the University of Michi-
gan are analyzed by Janet A. Fisher, who is concerned with the incomes,
savings, and assets of families during their economic life cycle. Most of the
figures are classified by the age of the household’s head, and the remarkable
regularities that appear agree fairly well with a priori expectations. Further
research should reveal whether the use of other indicators of the economic
age of the family, such as the duration of the marriage and the number of
children, will yield still more informative results, but Miss Fisher’s data did
not allow such analyses. The importance of this field of investigation is
underlined by recent developments in the study of the consumption function,
which emphasise its dynamic aspects. 

Dynamic factors, though of a different nature, are also considered in an 
ingenious but obscure contribution by Dorothy S. Brady. By comparing 
budget surveys from various periods she tries to identify a “normal form” 
of the consumption function, in which not only the level of but also the
change in income is regarded as a determining variable. Not having any data 
on income changes at her disposal, she apparently allows for the latter effect 
by using community income as well as family income. The precise logic of 
her method is unfortunately not made clear. It might be suspected, for in-
stance, that in comparing the 1917-19 with the 1935-36 survey changes in the
price level should be taken into account, but nothing is said on that topic.
Nor is the reader’s bewilderment allayed by cryptic footnotes such as “Ex-
penditures and savings were standardized to an average family size of 3.5
persons.” It is quite possible that Mrs. Brady’s approach leads to impor- 
tant results, but the present exposition is mystifying rather than convincing. 

Margaret G. Reid’s valuable paper illustrates J. R. Hicks’ dictum to the 
effect that the income one can calculate is not the income one seeks, whereas 
this true income escapes calculation. She starts from the observation that 
the income elasticity of consumption is less for farm families than for others, 
and analyzes to what extent this may be due to the income concept used 
rather than to differences in behavior patterns. The choice of concept is 
particularly important in the case of farm families because of the difficulty 
of separating operating and living expenses and the related problems of de- 
preciation and inventory changes. Similar questions arise for all families 
with highly variable incomes. Miss Reid surveys some alternative classifica- 
tions of families with reference to the resulting biases of this nature, but she 
defers a definite recommendation. 

In a concluding paper Simon Kuznets outlines directions of further in-
quiry. His stimulating suggestions, which demonstrate a penetrating insight
both in what is most necessary and in what is feasible in empirical research,
are too numerous to be discussed here. He might have put a little more em-
phasis on theoretical research, however; even if this does not lead to immedi-
ately applicable conclusions it may yet help to clarify the issues and methods
involved. Several of the contributions to this volume would have been im- 
proved if the models used had been more explicitly formulated. As it stands 
the volume shows impressively what a wealth of basic information is already 
in existence; the main problem for the near future is how to exploit it more 
effectively. Despite their various, and mostly excusable, defects these papers 
constitute a significant advance in the solution of that problem. 


Better Population Forecasting for Areas and Communities. Van Beuren Stan-
bery. (Domestic Commerce Series No. 32). U. S. Department of Commerce.
Washington, D. C.: U. S. Government Printing Office, 1952. Pp. iv, 80. 25 cents.
Paper.


FREDERICK F. STEPHAN, Princeton University


The problem of making forecasts of population growth for cities, counties,
and metropolitan districts frequently arises in the work of local industries,
public utilities, zoning and planning agencies, and municipal organizations
concerned with the provision of schools, water, sanitation and other services.
Frequently they find it necessary to plan far ahead because their facilities
must be built in single units that can not be enlarged without excessive ex-
pense. Forecasts of local population growth are affected greatly, of course,
by the factors that influence the location of new industry and the expansion
of business and other factors that affect employment in, and migration of




BOOK REVIEWS 673 


population to or from the area. In addition to these factors, there are, of
course, the natural growth of the resident population and the progression
of each cohort through the ages at which it enters the labor force, establishes
new households, and otherwise affects the total demand for the services for
which the population estimates are being prepared.

If the lot of the forecaster of population growth for the entire nation is a
difficult and unhappy one (and recent experience would seem to make this
undeniable), then the life of a forecaster of local populations must certainly
be unbearable. For such a hazardous occupation it would seem that any ac-
cumulation or accretion of knowledge and any tricks or devices, no matter
how crude, would be helpful. While it is too much to hope that a highly
scientific methodology could be put together for forecasting populations in
all kinds of communities, there are some communities in which special factors
do permit highly accurate forecasting if full advantage is taken of them.
Nonetheless, many statisticians look with misgiving at the relatively crude
methods of estimation that must be employed, perforce, by anyone so brash
as to attempt to make local population forecasts.

The booklet that is under review is a contribution to “those who use or 
make population projections.” It reviews some of the more familiar methods 
of forecasting, such as the projection of past population growth, projection 
by use of the relation to national or regional areas, for which forecasts 
are already available, projection of migration and natural increase, and 
projection from specific estimates of future employment. Work sheets are 
sketched for these computations. Much advice is given about things to
consider. It is largely a manual of suggestions and cautions written in un-
sophisticated language.

This guide book will be welcomed by many who want to see “better popu-
lation forecasting.” Perhaps it is not without its liabilities, however, for it may
encourage mediocre work at the expense of really good analysis. The bibli-
ography lists a number of studies that would have increased the value of
the text if they had been used adequately as examples. Moreover, there is
little evidence that the author has looked into the forecasting that is done
by the telephone and utility engineers or some of the regional business re-
search bureaus. The methods of projection could well be improved by use of
data correlated with population growth, such as meter installations, to fill
in the gaps between censuses and to permit more effective linkage to economic
projections for the area. If this is done then more efficient statistical methods
can be used. All this suggests that local estimates should be prepared with
the advice and assistance of those who can contribute all the data and tech-
niques and judgment that are available in the community and pertinent.
Such a group would be likely to emphasize the width of the band of uncer-
tainty that cloaks the forecasts, an aspect that this booklet very properly
does too.




Population Changes in Europe Since 1939. Gregory (Grzegorz) Frumkin. New 
York: Augustus M. Kelley, Inc., 1951. Pp. 191. 


Douglas F. Dowd, Cornell University


This study would appear, at first glance, to be one of modest compass,
dealing with assuredly useful but nevertheless dreary data. In Mr.
Frumkin’s hands the data come alive, his task is revealed as a staggering one, 
and one leaves the book shocked anew by at least one meaning of Hitler's 
War: the death and displacement of scores of millions of people. 

Mr. Frumkin, now with the United Nations, and editor of the Statistical
Year Book of the League of Nations throughout its existence, attempts “a
systematic determination of the magnitude of the changes which have oc- 
curred in Europe's population since 1939 and of their determining factors 
...country by country, on a uniform pattern, by means of balance 
sheets...” (p. 9). 

In the brief first chapter, the population background of Europe (before 
1939) is discussed, separate charts for twenty countries showing birth and 
death rates and the natural increase of population are presented, and we are
furnished with a handful of pithy remarks and caveats hinging on the de- 
mographer's craft, but worth pondering by all those who study society: 
e.g., “there is sometimes more stability behind changing figures and greater 
mobility behind stable figures than one would imagine" (p. 19). 

Chapter II provides a lucid and interesting discussion of the balance sheet 
approach of computing population changes. This approach— "somewhat 
arduous, strictly inductive"—the reviewer found easy to follow, clear, and 
highly informative. The method may be illustrated by listing the items in 
the French balance sheet, 1939-1945: Population, end of 1938; Births; 
“Normal” deaths; War losses (military; civilians, non-Jewish; Jews killed); 
Population shifts, net balance; Population end of 1945. Similarly, with some 
different inclusions, for 1946-1947. Items are tallied plus or minus. 
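
The balance-sheet arithmetic described above can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not Mr. Frumkin's own procedure: the class and field names are hypothetical, the sign convention for population shifts (positive for a net inflow) is an assumption, and the figures in the usage note are made-up round numbers, not data from the book.

```python
from dataclasses import dataclass


@dataclass
class BalanceSheet:
    """One country's population balance sheet for a period (hypothetical layout)."""

    start_population: int  # population at the start of the period
    births: int            # tallied plus
    normal_deaths: int     # tallied minus
    war_losses: int        # tallied minus (military and civilian combined)
    net_shifts: int        # assumed sign: positive = net inflow, negative = net outflow

    def net_change(self) -> int:
        """Sum of all plus and minus items over the period."""
        return self.births - self.normal_deaths - self.war_losses + self.net_shifts

    def end_population(self) -> int:
        """Closing population implied by the opening figure and the tallied items."""
        return self.start_population + self.net_change()
```

With illustrative values such as `BalanceSheet(1000, 100, 80, 30, -10)`, the plus and minus items net to a loss of 20 and the closing population is 980. Splitting `war_losses` into the military, civilian, and other sub-items listed in the review would only add further minus terms.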

Chapter III is the heart of the book. Here we find detailed balance sheets
for twenty-four European countries, and a careful discussion of relevant data
for the U.S.S.R. The distinction between the latter and the former rests on
the non-availability of statistics from the U.S.S.R. Each item on the balance
sheet is systematically explained, the reliability of the figures discussed, and
statistical conclusions are drawn. One cannot fail to be impressed by the
meticulousness of Mr. Frumkin’s research, and his success in achieving an
“unbiased enquiry.” On a different note, one cannot read these pages without
becoming sickened by the violence and destruction and misery underlying
the figures.

[...] the results of the study, in a valuable combined
[...], and goes on to say something of the meaning
[...] the future. Some of these remarks are
[...] “In spite of heavy losses, Europe . . . emerged from the
war with a larger population in 1947 than on the eve of the war”




(p. 175). “A characteristic feature of the last war was . . . that the main loss
of population was due not to fighting, but to mass-murder by the German
invader . . . Civilian losses were overwhelmingly concentrated in areas oc-
cupied by the Germans, and the number of Jews murdered made up almost
one-half of the total civilian losses. In Poland alone the number of Jewish
victims exceeded 3 millions” (p. 182). “War and genocide . . . accounted . . .
for the death of 15 million persons, almost 6 millions being military deaths,
and over 9 million deaths among civilians” (p. 181). Add to this the 17
millions of war losses estimated for the U.S.S.R., which Frumkin estimates
as being “on the low side” (p. 164). Add the millions in the “exceptionally
large shifts of the population chased across national boundaries” (p. 188).
The book concludes with a grim warning: “As the result of World War II,
national minorities in Europe have mostly disappeared. Mass-murders,
shifts of frontiers and ruthless mass-transfers were the instruments by means
of which the former European mosaic has been converted into an array of
ethnically homogeneous units each with the sign: ‘Trespassers will be prose-
cuted.’ These units, born under coercion, cannot be maintained ethnically
pure except through coercion” (p. 190).


Mr. Frumkin throughout emphasizes the tentative nature of his con-
clusions. Tentative though they may be, the diligence, the years of experi-
ence, and the considerable talents of Mr. Frumkin which come through on
every page would seem to indicate that it will be Frumkin or those following
his methods who will improve upon what looks to be a definitive work.


Bristol on the Move—A Travel Survey. British Transport Commission. London:
1953. Pp. 46. 10s. 6d.


Leonard P. Adams, Cornell University


The study was authorized by the British Transport Commission and car-
ried out by Research Services Ltd. in the winter of 1950-1951. It was
designed to provide information on the travel patterns of the people of
Bristol (defined for present purposes to include those residing within a ten-
mile radius [more or less] of the city) of interest to the sociologist, to the op-
erator of public services, and to the advertiser. The principal findings with
respect to the methods used in travel, costs to the traveler, types of travelers
(age, sex, income class, etc.), purposes of travel, time required, and other
items are presented in a series of tables supplemented by a narrative discus-
sion and some excellent photographs of the city and its environs.

No doubt the data obtained from this survey have been of service to trans-
port operators and advertisers, and probably to some extent to sociologists.
From the standpoint of those not particularly interested in the Bristol area
per se but interested in local studies of travel patterns, the design of the
Bristol survey and the methods used will be of interest. This reviewer has
recently been studying commuting patterns of industrial workers, so is prob-
ably inclined to be more critical of some of the methods used than those




interested in administration or selling, although in some significant respects,
as will be noted, the study has weaknesses from their point of view also.

In designing the study Research Services Ltd. evidently assumed that
there was little value in explaining the geographic limits of the Bristol area.
This assumption may be well founded in fact but there is no explanation of
why the ten mile limit was chosen or why the possibilities of longer distance
travel to or from Bristol were ruled out. The authors admit that there is
considerable traffic between Bristol and Bath which has been excluded from
the study because the City of Bath “has transport services of its own which
would have to form the subject of a separate investigation.” But there
is no way of telling from the report whether or not some of the households
surveyed have closer economic ties to Bath than to Bristol even though
they live within the ten mile radius from Bristol. Data on travel are not
given in terms of places of origin and destination. Omission of such infor-
mation, together with a predetermined definition of the Bristol area, raises
questions about the scope of the area and the relationship of its people to
other population centers that the published summary does not answer. To
what extent do the people of Bristol have economic and social ties with those
in Bath? How much cross traffic is there? For purposes of advertising and
labor supply marketing, should Bath and Bristol be combined in a single
area?

Presumably the investigators considered that for their purposes a more
or less arbitrary definition of the Bristol area would be adequate. In any case,
it facilitated the selection of the sample households to be visited. The primary
sample was chosen by taking every 150th address from the Electoral Reg-
isters covering the sample area. A secondary sample consisting of the fifth
address following each primary address was also selected at the time the
primary was drawn. When the householder at the primary address could not
be reached or would not cooperate the nearby secondary sample was sub-
stituted. In case there were two or more households at the same address,
additional interviews up to three were taken in order to avoid weakening
the geographic distribution of the sample. Such additional interviews were
called “subsidiary” interviews. Each member of the household was ques-
tioned concerning his or her use of public transport in the seven days pre-
ceding the interview. The methodology used was, in general, the same as that
followed in the London Survey, except that all use of public transportation
in the preceding seven days was recorded instead of just the “regular” jour-
neys. Although no measures of the adequacy and reliability of the sample
are given, it seems probable that the results give a reasonably good picture
of the characteristics of travel within the area selected.
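
The selection scheme just described, a systematic sample with a paired substitute for each primary address, can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually followed by Research Services Ltd.: the function names, the zero-based starting position, and the handling of a substitute that would fall past the end of the register are all hypothetical.

```python
def systematic_sample(addresses, interval=150, offset=5, start=0):
    """Pair each primary address with its secondary (substitute) address.

    Primary addresses are every `interval`-th entry of the register, counted
    from position `start`. The secondary address is the entry `offset` places
    after its primary, or None if that position lies beyond the register
    (an assumed convention; the report does not say what was done).
    """
    pairs = []
    for i in range(start, len(addresses), interval):
        j = i + offset
        secondary = addresses[j] if j < len(addresses) else None
        pairs.append((addresses[i], secondary))
    return pairs


def address_to_visit(primary, secondary, can_interview):
    """Use the primary address unless the householder cannot be interviewed.

    `can_interview` is a caller-supplied function returning True when the
    householder was reached and agreed to cooperate.
    """
    return primary if can_interview(primary) else secondary
```

On a register of 1,000 entries this yields seven pairs, the first at positions 0 and 5 and the last at positions 900 and 905. Because each substitute lies so close to its primary, substitution roughly preserves the geographic spread of the sample, though it does nothing to correct for differences between households that cooperate and those that do not.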

While a single cross-section view of travel patterns has value from the
standpoint of operators and advertisers, it has severe limitations if one is
to understand trends in housing location, the journey-to-work, mode
of travel and other matters. Travel patterns in the Bristol area, if they are
at all similar to those in industrial centers in this country, are probably dy-
namic, changing with such factors as new housing developments and the
location of new industrial plants; and, judging by some of the photographs
in the report, the Bristol area has had some interesting post-war housing de-
velopments. Changes in travel patterns probably could be measured with a
fair degree of accuracy by the interview method. Experience with travel
studies in this country suggests, however, that employer personnel records
provide information more readily with respect to journey-to-work patterns
and they also show changes over time. If similar records are available in
English plants, Research Services Ltd. in making future studies may wish
to use this source to provide a basis for defining the geographic limits of the
area to be surveyed and also to show changes in the distances traveled and
characteristics of the work force employed.


International Shipping Cartels. Daniel Marx, Jr. Princeton, N. J.: Princeton
University Press, 1953. Pp. xiii, 323. $6.00.


Rockwood Cain, Berea College


This is a scholarly book about shipping conference agreements as a form
of international cartel. This topic is usually neglected by books on inter- 
national cartels and inadequately treated in transportation texts. The author 
is to be commended for his comprehensive description of the nature, history, 
purposes, and methods of shipping cartels. Competition from non-conference 
liners and tramp shipping is also discussed. 

Particularly detailed is his analysis (in Chapters IV, V, and VII) of Ameri-
can and British experience in the investigation and regulation of shipping
practices dating from about the beginning of the twentieth century. All the
main modern arguments for and against shipping cartels are virtually cov-
ered in the early reports of the Royal Commission on Shipping Rings (1909)
and the Alexander Committee (1914). Out of the recommendations of the
latter there evolved in the United States the Shipping Act of 1916 which,
among other things, prohibited deferred rebates, “fighting ships,” and unjust
discrimination. Curiously enough, there is no specific reference to the Bland-
Copeland Bill of 1935 which unsuccessfully attempted to reverse the 1916
legislation.

Chapter VI includes brief uneven sketches of restraints imposed upon 
national participation in shipping conferences by other selected countries— 
British Dominions and some European and Far Eastern countries. The 
economics of shipping conferences, discussed in Chapters II, III, and XII, 
supplemented by a note in the Appendix on discriminatory pricing, provide 
a theoretical background for the factual presentation. 

The reader may jump to Chapter XIV for the conclusions of the book
without loss of the main argument. In trying to be objective, the author
sometimes appears to vacillate in his attitude toward international shipping
cartels, and there is occasional ambiguity as to what he is definitely recom-
mending. Generally speaking, he regards international shipping cartels as




inevitable, capable of abusing their privileges, and also of suffering from 
wastes of competition. Monopoly and cutthroat competition are both de- 
cried. “The record seems to indicate that relatively few conferences have 
been guilty of very serious abuses, unless exclusive patronage contracts and 
pooling arrangements approved by the Commission are so condemned” (p. 
136). Regulation by national governments and an international agency (at 
the time of writing, the Intergovernmental Maritime Consultative Organiza- 
tion had not yet come into existence) would both have their limitations, 
though desirable for supervising tying arrangements and for preventing 
monopoly abuses and undue discrimination. Some sort of improvement in 
self-regulation through the existing conference system is favored, but the 
author does not make clear how this can actually be attained. 

Chapter III, on the general economic and political environment, deals with
the relation of shipping to international trade, location of economic activity, 
and balance of payments. It contains the bulk of the book’s trade and ship- 
ping statistics. Elsewhere statistics are few. Though there is a table in Chap- 
ter XII comparing fluctuations in indices of tramp freight and conference 
general cargo rates, actually freight rates are cited rather infrequently. In 
Chapter IX there are also charts showing the names and membership of 
various freight and passenger conferences participating in United States 
foreign trade and travel. One misses, however, the complete text of a confer- 
ence agreement of the type described in the book, of which we are told 
there are over one hundred on file at the Maritime Commission (now the 
Federal Maritime Board).

A word of caution on figures given as shipping company earnings of
foreign exchange: The early British figures (p. 43) really refer to total earn-
ings by British shipping companies in carrying exports and imports, as well
as inter-third country trade, and are therefore not all additions to foreign
exchange. Of course, the British method of c.i.f. valuation of imports com-
pensated for the overstatement of transportation credits in the balance of
payments. The 1949 shipping earnings figure, calculated on an f.o.b. basis,
is not comparable to the earlier figures. Similar remarks apply to the Nor-
wegian data cited on p. 42. Furthermore, in singling Norway out as the
country for whose economy shipping is of the greatest importance, the ratio
of shipping foreign exchange earnings to national income would have been
a more significant criterion than net earnings figures alone or their ratio to
the trade balance.

On the whole this is a very fine book on a much-neglected subject. A 
wealth of information has been pieced together from books, documentary 
reports, government agencies, and case studies. It fills a need in the literature 
of international cartels and transportation. Students of economics, govern-
ment, and transportation will find it very useful. Since the author can already
point out some comparisons with the international aviation field, I hope that
he may someday also publish a treatise on international air transport cartels
on the same high level.


PUBLICATIONS RECEIVED 


Actuaries, Society of. 1951 Impairment 
Study. 1954. $7.50. 

Arizona, University of, Agricultural Ex-
periment Station. Barriers to the Interstate
Movement of Milk and Dairy Products in
the Eleven Western States. Tucson, April
1954. Paper.

Barclay, George W. Colonial Develop-
ment and Population in Taiwan. Princeton:
University Press, 1954. $5.00.

Blackwell, David, and Girshick, M. A.
Theory of Games and Statistical Decisions. 
New York: John Wiley and Sons, 1954. 
$7.50. 

Bovet, Eric. L'organisation rationelle de
la distribution, moyen de stabilisation écon-
omique. Paris: Delachaux et Niestlé, 1954.
Frs. 12.50. Paper.

Burington, Richard Stevens, and May,
Donald Curtis, Jr. Handbook of Probabil-
ity and Statistics with Tables. Sandusky,
Ohio: Handbook Publishers, Inc., 1953.
$4.50.

California, Department of Industrial Re- 
lations. Union Labor in California, 1953. 
San Francisco, 1953. Paper. 

Cameron, Burgess. The Determination 
of Production. New York: Cambridge Uni- 
versity Press, 1954. $3.75. 

Cavé, R. Le control statistique de fabrica-
tions. Paris: Eyrolles, 1953.

Chevalley, Claude C. The Algebraic The- 
ory of Spinors. New York: Columbia Uni- 
versity Press, 1954. $3.75. 

Clem, Mary A., and Federer, W. T. Ran- 
dom Arrangements for Lattice Designs. 
Ames: Iowa State College Press, 1950. 
$1.00. Spiralbound, paper. 

Federer, W. T. Random Arrangements
for Some Three-Dimensional Lattice De-
signs. Ithaca, N. Y.: Cornell Agricultural
Experiment Station, 1953. $2.50. Spiral-
bound, paper.

Firmin, Peter A. The Michigan Business
Receipts Tax: An Appraisal. Ann Arbor:
University of Michigan, 1953. $2.00. Paper.

Ghosh, M. K., and Prakash, Om. Prin-
ciples and Problems of Industrial Organiza-
tion. Allahabad: The Indian Press, Ltd.,
1953. Rs. 11.

Gilbert, Milton, and Kravis, Irving B. An
International Comparison of National
Products and the Purchasing Power of Cur-
rencies: A Study of the United States, the
United Kingdom, France, Germany, and
Italy. Paris: OEEC, 1954. $3.00.

Henderson, W. O. Britain and Industrial
Europe, 1750-1870: Studies in British In-
fluence on the Industrial Revolution in West-
ern Europe. Liverpool: University Press,
1954. 25 s.

Herskovits, Melville J. Franz Boas: the 
Science of Man in the Making. New York: 
Charles Scribner’s Sons, 1953. $2.50.

Hoel, Paul G. Introduction to Mathe- 
matical Statistics. Second Edition. New 
York: John Wiley and Sons, 1954. $5.00. 

Hooker, P. F., and Longley-Cook, L. H.
Life and Other Contingencies, Vol. 1. New
York: Cambridge University Press, 1953.
$4.50.

International Labour Office. Yearbook of 
Labour Statistics, 1953. Thirteenth issue. 
Geneva, 1953. $5.00. Paper. 

Leibenstein, Harvey. A Theory of Eco-
nomic-Demographic Development. Prince-
ton: University Press, 1954. $4.00.

Lieblein, Julius. A New Method of Ana-
lyzing Extreme-Value Data. Technical
Note 3053. Washington, D. C.: National
Advisory Committee for Aeronautics, 1954.
Paper.

Macdonald, Gordon D. Apartment Build-
ing Construction Manhattan 1902-1953.
New York: The Real Estate Board of New
York, Inc., 1953. $5.00. (To members of the
National Association of Real Estate Boards
and other professional associations $1.00.)

Macdonald, Gordon D. Office Building
Construction Manhattan [...],
supplement number one. New York: The
Real Estate Board of New York, Inc.,
1953. $10.00. (To members of the Na-
tional Association of Real Estate Boards
and other professional associations $5.00.)

Marcus, Edward. Canada and the Inter-
national Business Cycle, 1927-1939. New
York: Bookman Associates, 1954. $3.75.

Marvick, Dwaine. Career Perspectives in
a Bureaucratic Setting. Michigan Govern-
mental Studies No. 27. Ann Arbor: Uni-
versity of Michigan, 1954. $2.25. Paper.

Mauldin, W. Parker, and Akers, Donald 
S. The Population of Poland. Washington, 
D. C.: U. S. Government Printing Office, 
1954. $1.00. Paper. 

Maxwell, E. A. An Analytical Calculus, 
Vol. I, Vol. II. New York: Cambridge Uni- 
versity Press, 1954. $2.75, $3.50. 

Mitchell, Robert B., and Rapkin, Chester.
Urban Traffic: A Function of Land Use.
New York: Columbia University Press,
1954. $5.00.

Modley, Rudolf, and Cawley, Thomas,
Jr., editors. Aviation Facts and Figures.
Washington, D. C.: Lincoln Press, Inc.,
1953.

National Bureau of Standards. Tables of
Circular and Hyperbolic Sines and Cosines
for Radian Arguments. Applied Mathe-
matics Series 30. Washington, D. C.: U. S.
[...]

[...] for Sexagesimal Interpolation. [Applied]
Mathematics Series 35. Washington, D. C.:
U. S. Government Printing Office, 1954.
$2.00.

National Industrial Conference Board,
Inc. The Economic Almanac, 1953-1954.
New York: Thomas Y. Crowell Company,
1953. $2.95.

Neifeld, M. R. Trends in Consumer
Finance. Easton, Pa.: Mack Publishing
Company, 1954. $6.00.

New York, State of, Department of
Labor. Employment Patterns of Insured
Workers in Selected New York Industries,
1947-1951. New York: Division of Re-
search and Statistics (1440 Broadway),
1953. Paper.

Phelps, Clyde Williams. Instalment Sales
Financing: Its Services to the Dealer. Balti-
more, Md.: Commercial Credit Company
(14 Light St.), 1953. Paper.

Prest, A. R., assisted by A. A. Adams.
Consumers' Expenditure in the United
Kingdom, 1900-1919. New York: Cam-
bridge University Press, 1954. $7.50.

Richardson, C. H. An Introduction to the
Calculus of Finite Differences. New York:
D. Van Nostrand Co., Inc., 1954. $2.50.

Salzer, Herbert E. Tables of Coefficients
for the Numerical Calculation of Laplace
Transforms. Applied Mathematics Series
30. Washington, D. C.: U. S. Government
Printing Office, 1953. 25 cents. Paper.

Schutzenberger, M. P. Contribution aux 
applications statistiques de la théorie de 
l'information. Vol. III, Fascicules 1-2, 
1954. Paris, Institut Henri Poincaré. Paper. 

Schultz, William J., and Reinhardt, Hed-
wig. Credit and Collection Management.
Second Edition. New York: Prentice-Hall,
Inc., 1954. $9.00.

Spurr, William A., Kellogg, Lester S.,
and Smith, John H. Business and Economic
Statistics. Homewood, Ill.: Richard D. Ir-
win, Inc., 1954. College price, $6.00.

Tanganyika. Report on the Census of the
Non-Native Population Taken on the Night
of 26 February 1948. Dar Es Salaam: Gov-
ernment Printer, 1953. 25/. Paper.

[...] Institute, Inc. Federal-State-Local
[...]. Symposium volume.
Princeton, N. J., 1954. $5.00.

[...] Report on the Cen-
sus of the Non-Native Population of Uganda
Protectorate Taken on the Night of 26th
February, 1948. Nairobi, 1953. Paper.




United Nations, Department of Economic 
Affairs. Economic Survey of Asia and the 
Far East, 1953. Bangkok, February 1954. 
$1.50. Paper. 

. Economic Survey of Europe in 
1953 including a Study of Economic Develop- 
ment in Southern Europe. Geneva, 1954. 
$2.50. Paper. 

. A Study of Trade between Asia 
and Europe. Geneva, 1953. $1.50. Paper. 

United Nations, Department of Social 
Affairs, Population Division. Additional In- 
formation on the Population of Tanganyika. 
New York: 1953. 30 cents. Paper. 

United Nations, Economic Commission
for Asia and the Far East. Mobilization of
Domestic Capital: Report and Documents of
the First Working Party of Experts. New
York: Cambridge University Press, 1952.
$1.50. Paper.
. Mobilization of Domestic Capital: 
Report and Documents of the Second Work- 
ing Party of Experts. Bangkok, 1953. $2.50. 
Paper. 

United Nations, Economic Commission 
for Europe. Annual Bulletin of Transport 
Statistics 1952. Geneva, 1952. $1.25. Paper. 

United Nations, Statistical Office of. 
Concepts and Definitions of Capital Forma- 
tion. Statistical Papers, Series F, No. 3. 
New York: 1953. 25 cents. Paper. 

. Demographic Yearbook 1953.
New York, 1953. $5.00, paper; $6.50,
cloth.


. International Migration Statistics.
Statistical Papers, Series M, No. 20. New
York, 1953. 25 cents. Paper.
. Population and Vital Statistics 
Reports. Statistical Papers, Series A, Vol. 
VI, No. 1. New York, 1954. 30 cents. Paper. 
. Real Price Comparisons for 
International Salary Determination. Statis- 
tical Papers, Series M, No. 14 add. 1. New 
York, 1953. 30 cents. Paper. 
. Statistical Yearbook 1953. New
York, 1953. $6.00, paper; $7.50, cloth.
. Statistics of National Income and
Expenditure. Statistical Papers, Series H,
Nos. 4 and 5. New York, 1953. 60 cents and
80 cents. Paper.
. A System of National Accounts 
and Supporting Tables. Studies in Methods, 
No. 2. New York, 1953. 50 cents. Paper. 
United Nations, Technical Assistance 
Programme. The Economic and Social De- 
velopment of Libya. New York, 1953. $1.75. 
Paper.
Vicary, James M. An Annotated Bibliog-
raphy of Word Association References to
Marketing Researchers. New York: James
M. Vicary Company (22 E. 60th St.), 1954.
Free. Mimeo.

Waugh, Frederick V. Readings on Agricul-
tural Marketing. Ames: Iowa State College
Press, 1954. $5.00.

Wert, James E., Neidt, Charles O., and
Ahmann, J. Stanley. Statistical Methods in
Educational and Psychological Research.
New York: Appleton-Century-Crofts, Inc.,
1954. $5.00.

Whelpton, Pascal K. Cohort Fertility:
Native White Women in the United States.
Princeton: University Press, 1954. $6.00.
Offset.


Wilks, S. S. Analisis Estadistico Ele-
mental. Traduccion de Sixto Rios y Jose
Royo. Madrid: Consejo Superior de Inves-
tigaciones Científicas, Departamento de
Estadistica, 1952. Paper.

Wold, Herman. A Study on the Analysis
of Stationary Time Series. Second Edition.
Stockholm: Almquist and Wiksell, 1954.
Sw. kr. 38.

Wolfenden, Hugh H. Population Statis-
tics and their Compilation. Chicago: Univer-
sity Press, 1954. $7.50.

Woytinsky, W. S., and Woytinsky, E. S. 
World Population and Production: Trends 
and Outlook. New York: The Twentieth 
Century Fund, 1953. $12.00. 

Žarković, S. S. Greške popisa stanovništva.
(Population Census Errors.) Beograd, 1954.
Paper.


RANDOM DIGITS (17,376-20,875)


From A Million Random Digits, to be published for The Rand Corporation toward the end of 1954 by
the Free Press, Glencoe, Illinois.


04808 99531 47991 46064 80467 
71924 64882 94893 82935 99076 
56410 89552 28404 74525 74212 
38851 16144 99542 27481 21992 
91428 10589 09454 43308 66753 
40083 17141 30702 31997 69856 
93419 10474 41796 88285 02448 
03704 65516 65448 20203 21189 
78181 90060 74904 42627 16638 
45972 93572 76011 03426 50226 
60898 63968 62264 64603 51866 
40398 54180 65869 87977 02799 
68245 76912 01222 59516 36438 
27019 15248 66444 25267 05171 
99868 88894 43769 52239 05919 
87904 74135 53842 59520 23979 
68851 41049 97190 53984 04773 
71742 57223 66599 86071 01901
02742 48803 17823 22093 43907

56181 96052 67211 61712 54590
55355 61548 55988 47309 23749
78961 41072 09876 18903 30292
92654 97226 58434 71025 63892
13757 37719 84450 02697 60309
05776 85945 74651 00216 50842
71029 83083 60427 78495 99809
61672 01184 46438 27698 40652 
42088 77983 5870! 42176 67356 
13652 16640 2789 26907 86760 
53186 97859 97213 19859 41037 
47890 10690 26486 38744 25943 

65654 34629 88831 97253 67282
00324 17120 39900 67135 42712
48244 26191 88421 90491 83290
64081 47104 15018 45600 17241
60617 06414 56596 63011 24193
72860 18452 42983 23931 11789
04031 55283 19605 34163 86540
06884 15444 209310 7 17048 24243
26611 09551 82626 38104 58432




58932 
73073 
42665 
59985 
50943 


22224 
24473 
38582 
46094 
91061 


00397 
14328 
88534 
97347 
01366 


37106 
06476 
81717 
51583 
50120 


89761 
08943 
71685 
17402 
52606 


66035 
21565 
88735 
50404 
80834 


26872 
16530 
84644 
88620 
22209 


04795 
54291 
30654 
11123 
56577 


58987 
16851 
02104 
54440 
87681 


24337 
62557
02913 
68706 
05930 


33491 
89301 
77622 
60562 
63035 


91576 
76920 
14672 
91838 
73729 


53158 
72952 
68614 
73087 
01868 


21584 
97122 
94516 
41758 
83655 


77480 
11057 
79368 
94385 
92127 


76264 
45403 
03080 
28017 
93109 


79021 
17329 
86828
94716 
68615 


14592 
616835 
218339 
49393 
83291 


42969 
70476 
77706 
31618 
69847 


52574 
72781 
58822 
13846 
78416 


04617 
74066 
15779 
85747 
60344 


16781 
88864 
93362 
79574 
99315 


71872 
27048 
83073 
77135 
51667 


93712 
44978 
15427 
55004 
88345 


28683 
98849 
43710 
01717 
42588 


29148 
33782 
71653 
52611 
9185% 


51571 
87959 
50552 
84622 
58113


39634 
32186 
65024 
12911 
12329 


59144 
77118 
18924 
35707 
81848 


83649 
17186 
82941 
56197 
00194 


684 AMERICAN STATISTICAL ASSOCIATION JOURNAL, SEPTEMBER 1954


10119 31347 12659 11574 70052 
98390 30240 28330 41145 16918 
08172 23823 48433 57222 34435 
21238 19051 50768 40807 88681 
79342 44640 93942 97371 16842 
93039 79367 00812 41365 04515 
62865 09576 97207 33739 78345 
00800 72496 24767 61768 07228 
64340 02224 48336 14801 72188 
92168 52692 31224 12185 43065 
20494 18813 16242 40257 66402 
87693 30242 10545 69128 51528 
05567 05561 82071 07234 67690 
85166 37189 75671 33879 27411 
26704 41922 56650 40236 66207 
01047 81624 71395 62310 41501 
58183 21952 84098 28913 55736 
64667 57092 21315 04731 71877 
27149 13843 09817 09407 88276 
66232 80293 74502 36925 60184 
40500 21406 00571 87320 81683 
35892 49668 83991 72088 30210 
54819 26094 51409 21485 94764 
64224 47909 09994 23750 17351 
36913 58173 45709 83679 82617 
64254 64745 10614 86371 43244 
82018 95536 74031 31807 70133

28833 44043 96215 21270 59427
96879 27659 95463 53847 40921
95938 76014 99818 16606 19713
97154 71237 06073 57343 51428
78790 17026 59008 28543 11576
25034 59325 08844 95774 49323
70116 44091 88505 15575 44927
66904 23000 73259 68626 98902
91171 28299 62619 81550 46798
74547 13260 79262 55831 83784 
30448 14154 15195 39465 82353 
06584 20867 45898 66415 89349 
68548 86576 14344 75889 04514 
49319 50206 22024 56124 50749 
81034 86779 34622 70859 33045 
68905 44234 18244 31602 38388 

88530 72096 44459 31449 93182
37227 11302 04667 32526 64713
83220 50529 20619 11606 10297

5 eu 30017 35347 35038 16648

pe 556 76728 60535 59961 76979
9040 96390 65989 38375 30332


85185 72849 58611 31220 66108




JOURNAL OF THE AMERICAN 
STATISTICAL ASSOCIATION 


Number 268 DECEMBER 1954 Volume 49 


METHODS OF APPORTIONING SEATS IN THE HOUSE 
OF REPRESENTATIVES 


WALTER F. WILLCOX*
Cornell University 


THE subject of Congressional Apportionment was transferred a few
years ago to the Judiciary Committees of Congress; no member of
either committee, I believe, was in Washington in the late twenties 
when the long fight over the question ended with the dangerous in- 
crease in the size of the House halted and with reapportionment made 
a ministerial act about which Congress need no longer concern itself 
after each census as it had done for 130 years. 

In view of that situation the Judiciary Committee of the House
asked me to report to them on methods of apportionment and to include
so much of the congressional history of the subject as might help
them in dealing with it. My reply to the committee was sent some
months ago; it ended by proposing two amendments to the automatic
act of 1929, one of them changing the present method to a novel one
which I now think the best, the other lopping off ten members
automatically after each future census from an overgrown House until it
approaches a membership of three hundred, the number mentioned
often in congressional debates as a desirable goal. This article will
supplement my report to Congress and lay before scholars the main
results of my work, which has lasted for more than half a century, a
period during which some of my early conclusions have changed and
several new ones emerged.

I had to prepare apportionment tables for Congress after the census
of 1900. After the following census two groups of students entered the
field. One, for whom Allyn A. Young spoke, held that the question of
method “is mathematical and ought to be decided upon the basis of a
general consensus of mathematical opinion.”¹ The position of the other
group was stated thus: “The question involved in the apportionment of

* Editor's Note: Professor Willcox was President of the American Statistical Association in 1912
and of the American Economic Association in 1915.

1 Allyn Young, circular letter to statisticians and mathematicians dated Nov. 19, 1926 (unprinted).


representatives is primarily one of constitutional law. The role of 
mathematics in the problem is to make clear the consequences of any 
given interpretation of the Constitution.”²

This article will defend the second position. My understanding of the
difference between the two groups was stated in a letter to an opponent:
“Our conference left me with the impression that all the men
present except Professor X and me believed that there is only one right
method of apportionment, the method of equal proportions. . . . My
view is that the methods can be arranged in an order of preferability,
that their sequence in that order depends upon the predominant object
to be secured by apportionment, that the object of apportionment is a
political rather than a mathematical problem and one to be determined,
therefore, not by academic students but by Congress.”³

Those who wrote the Constitution intended, I believe, to make the 
resident of a state the unit of representation in the House, as they had 
made the state itself the unit in the Senate. The Constitution contains 
three passages which bear on the method by which to carry out that 
intention. They are: 

(1) “The number of Representatives shall not exceed one for every
thirty thousand." 

(2) “Representatives shall be apportioned among the several States 
according to their respective numbers.” 

(3) “Each State shall have at least one Representative." 

The first passage alone determined the method of apportionment
used by Congress before 1840, the second alone has underlain the four
methods used since 1840, the second modified by the third furnishes a
basis for the novel method proposed herein. My arguments are addressed
primarily to Congress because in deciding this question Congress
is the jury, but I cannot succeed in that quarter without support
from the public, hence this article.

The House of Representatives at first had only 65 members, but
Congress soon became convinced that it should be enlarged as much as
possible. The limit on size set in the first passage proved to be ambiguous;
for five months Senate and House disputed over how to interpret
it. Did it mean not more than one for every thirty thousand in each
state, 112 in all, or not more than one for every thirty thousand in all
the states, 120 in all? The House sent on to the Senate a bill apportioning
112 representatives. The Senate returned it after adding eight seats


2 F. W. Owens, “On the Apportionment of Representatives,” American Statistical Association
Publications, 17 (1921), p. 968.

3 Walter F. Willcox, circular letter of Aug. 12, 1931 (unprinted).


for large remainders and in that form it reached the President. Washington
vetoed it, relying on the advice of Jefferson, then Secretary of
State and in charge of the census, that the method was unconstitutional.
That veto established the method which Congress used for half
a century; it may be called the Jefferson method or method of rejected
fractions.⁴ Its essential feature is that it apportions representatives only
for units.

This method was abandoned forty years later as a result of Webster’s 
criticism. Under the discarded method the larger a state the smaller 
the proportion between its rejected fraction and its population, hence 
the smaller its district population and the more representatives it 
would get. Under 1950 figures, for example, the method of rejected 
fractions would have transferred to states above the average size nine 
seats which the present method gave to states below the average size. 
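
The bias described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two populations and the divisor are hypothetical, not census figures, and `rejected_fractions` is a name invented here for the rule that one seat goes to each whole unit of the divisor and every fraction is dropped.

```python
def rejected_fractions(populations, divisor):
    """Give one seat per whole multiple of the divisor; drop every fraction."""
    return {state: pop // divisor for state, pop in populations.items()}


# Hypothetical populations and divisor, chosen only for illustration.
populations = {"Large": 1_000_000, "Small": 100_000}
divisor = 33_500

seats = rejected_fractions(populations, divisor)
# District population = residents per representative in each state.
district = {state: populations[state] / seats[state] for state in populations}

for state in populations:
    print(state, seats[state], round(district[state]))
```

Both hypothetical states lose nearly a whole unit to the rejected fraction, but the loss is a far smaller share of the large state's population, so the large state ends with the smaller district population.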

Among the tests of a method one needs to be mentioned here because 
it underlies Webster’s method. First divide the states into three groups, 
large, small and very small, the line separating large and small being 
the average population of those two groups, and the line separating 
small and very small being whether a state gets its one seat for popula- 
tion or by constitutional guarantee. Next compute the average district 
population of each of the two groups of large and small states. The
nearer those averages are to each other, the better the method.

This test also measures the nearness to equal representation given to
one person or one million persons wherever the residence; that is prob- 
ably what those who wrote the Constitution had in mind. 

Webster's proposal was to give supplementary seats for fractions
larger than one half, “major fractions.” An apportionment bill had
passed the House after the 1830 census and been referred to his committee.
He reported that the method of rejected fractions was unconstitutional
because it did not meet the second constitutional requirement
and apportion seats to the several States according to their respective
numbers. Webster computed the number of representatives a
few states would have if the proportion each had of 240 was as near
as possible to the proportion of its population to that of all the states.
He showed that the population of New York and Vermont, for example,
would entitle the former to 38.59 representatives and the latter to 5.65
and claimed that New York should receive 39 seats and Vermont 6
instead of the 40 and 5 given them in the bill.


If he had been able to show Congress that the current method must


4 Jefferson wrote in a memorandum for which Washington had asked, “Fractions must be neglected
because the Constitution . . . has left them unprovided for.” Writings (1940 ed.), vol. 3, pp. 201-11.


overrepresent a large state and underrepresent a small one, his argu- 
ment might have carried greater weight. He might have said that by 
it the average district population in large states would be much below 
that in small ones, and that, if Congress should adopt his amendment,
the difference would be only one sixth as great. Stated in a way perhaps 
more meaningful to Congress, his method would have transferred 
seats from three large states, Kentucky, Pennsylvania, and New York, 
to three small ones, Delaware, Missouri and Vermont. 

He carried the Senate, but when the House rejected his amendment 
the upper house yielded. His failure may have been due even more to 
a weakness in his mathematics which I have explained elsewhere and 
which I was able eighty years later⁵ to correct.

Webster’s contributions were that he called attention to the second 
constitutional requirement, showed that the results of the current 
method violated it and that to apportion supplementary seats for frac- 
tional remainders larger than one half would be a great improvement. 
But he did not answer either of two questions the first of which long 
vexed Congress. That is, How is the common divisor which his method 
needed to be found? The other is, How is the problem affected by the 
third requirement in the constitution? 

In the period between 1832 and 1910 Congress tried to adapt Webster’s
revolutionary idea to a problem the shape of which was changing.
In 1842 Congress experimented with his method. The law specified the
common divisor which he did not know how to compute for a specified
number of seats and provided for “one additional Representative for
each State having a fraction greater than one moiety of the said ratio,”⁶
but it did not specify the size of the House. When the 1850 census was
at hand Congress was minded to stabilize the size of the House at 233
members by a ministerial apportionment. The law instructed the Secretary
of the Interior, first, to divide the combined population of the
states by 233, then to divide the population of each state by the quotient,
and finally to apportion one seat for each unit and enough supplementary
seats for large fractions to reach the required total.
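
The three steps of the 1850 law can be written out as a short Python sketch. The function name `vinton_method` and the three populations are hypothetical and serve only to show the arithmetic; exact fractions are used so that remainders compare without rounding error, and ties between equal fractions are not treated specially.

```python
from fractions import Fraction


def vinton_method(populations, house_size):
    """Apportion by whole units first, then by the largest remaining fractions."""
    total = sum(populations.values())
    ratio = Fraction(total, house_size)            # step 1: total / size of House
    quotients = {s: Fraction(p) / ratio            # step 2: state / ratio
                 for s, p in populations.items()}
    seats = {s: int(q) for s, q in quotients.items()}   # one seat per whole unit
    remaining = house_size - sum(seats.values())
    # step 3: supplementary seats go to the largest fractions until the total is met
    by_fraction = sorted(populations,
                         key=lambda s: quotients[s] - seats[s],
                         reverse=True)
    for state in by_fraction[:remaining]:
        seats[state] += 1
    return seats


# Hypothetical populations, chosen only for illustration.
populations = {"A": 5_300, "B": 3_300, "C": 1_400}
result = vinton_method(populations, 10)
print(result)
```

With these figures the quotients are 5.3, 3.3 and 1.4, so the one supplementary seat goes to C for a fraction smaller than one half.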

This method dodged one of the difficulties in Webster’s proposal, 
how to find the common divisor, but two others remained. The method 
might yield one or more seats for large minor fractions, or withhold 
seats for one or more small major fractions. As long as the House did 
not increase in size the method worked well, but after the 1870 census 


5 “Last Words on the Apportionment Problem,” Law and Contemporary Problems, 17 (1952).

6 Statutes at Large, vol. 9, p. 432.


Congress resumed the policy of apportioning enough seats to keep
every delegation intact and the experience of forty years with that
change wrecked the method. The Superintendent of the Census began
to send Congress tables based on the 1850 method which showed the
distribution of each number of seats within the limits of size which
interested the House. These tables showed that now and then one or
more states might receive a seat for a large minor fraction, one or
more might fail to receive a seat for a small major fraction, or when
one member was added to the total two states might gain and a third
lose a seat (the “Alabama paradox”).
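
The paradox is easy to reproduce. The Python sketch below applies the 1850 rule (whole units first, then the largest fractions) to three hypothetical states; the populations 6, 6 and 2 are an illustrative textbook-style case, not census figures, and the function name is invented here.

```python
from fractions import Fraction


def largest_fractions(populations, house_size):
    """Whole units first, then supplementary seats for the largest fractions."""
    total = sum(populations.values())
    quotients = {s: Fraction(p * house_size, total) for s, p in populations.items()}
    seats = {s: int(q) for s, q in quotients.items()}
    remaining = house_size - sum(seats.values())
    ranked = sorted(populations, key=lambda s: quotients[s] - seats[s], reverse=True)
    for state in ranked[:remaining]:
        seats[state] += 1
    return seats


# Hypothetical populations, chosen only for illustration.
populations = {"A": 6, "B": 6, "C": 2}
seats_10 = largest_fractions(populations, 10)
seats_11 = largest_fractions(populations, 11)
print(seats_10)
print(seats_11)
```

When the House grows from 10 to 11 members, A and B each gain a seat and C drops from two seats to one, which is the pattern described above.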

As I have said, my connection with the problem began in 1900 when 
I was placed in charge of a division in the Census Bureau which was to 
prepare the apportionment tables. The law said we should use the Vin- 
ton method, but, as tables based on earlier figures had shown its weak- 
ness, we submitted a second set based on an imperfect understanding 
of the Webster method. 

The sizes of the House which interested Congress were 357, its exist- 
ing size, and 386, the size at which no state would lose a seat. Our 
Vinton tables contained two examples of the Alabama paradox at 
just these numbers, one affecting Colorado, the other Maine. For Colorado,
the figures ran:

    Size of House         356   357   358
    Seats for Colorado      3     2     3

For Maine they were:

    Size of House         384   385   386   387
    Seats for Maine         4     4     3     4


Congress got over the hurdle and apportioned 386 seats by starting
with the table for 384, which contained two quotients with major fractions
for which no seats were apportioned. For each of these quotients
Congress gave an extra seat and thus reached 386, the number desired,
with no state losing a seat and no major fraction unrewarded.

After following the long House debate I returned to Cornell sure
that the principle of the Webster method was sound but its mathematics
weak. Once the difficulty was grasped, the solution was obvious:
apply the sliding divisor concept. After the 1910 figures had been
announced I took to Washington a set of tables and an explanatory
letter in which I had written:

“The history of reports, debates, and votes upon apportionment
seems to show a settled conviction in Congress that every major fraction
gives a valid claim to an additional Representative. . . . The present
method is based upon that conviction and seeks to facilitate action
in conformity with it. Because of this feature I have called it the method
of major fractions.

“The results are simple, but the method itself is somewhat difficult 
to explain. If a ratio of 240,000 persons to each Representative be 
assumed arbitrarily as a starting point, that number divided into the 
population of each state and one Representative assigned for each 
whole number and each major fraction in the series of quotients, a total 
of 383 Representatives is reached. If the ratio be then diminished by 
10 to 239,990, no difference in the apportionment will result, but the 
decimal in each quotient will be slightly increased. If the ratio be fur- 
ther reduced to 239,980, 239,970, etc., the decimals continue to in- 
crease with each change of ratio, but with varying rapidity. It is a 
simple problem to compute in which State the decimal will first pass 
.500 and become a major fraction and at just what ratio the change 
will occur. In the present case the State whose decimal first reached
.500 is Illinois and the corresponding ratio is 239,940, . . . which has
been called the boundary ratio.”⁷
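
The sliding-divisor computation quoted above can be sketched in Python. Everything in the example is an assumption made for illustration: the three populations are hypothetical rather than the 1910 figures, and `seats_at` and `boundary_ratio` are names invented here. A state holding n seats at the current ratio has its decimal reach .500 when the ratio falls to population / (n + 1/2), and the state for which that value is largest crosses first.

```python
from fractions import Fraction


def seats_at(populations, ratio):
    """One seat for each whole unit and one more for each major fraction."""
    return {s: int(Fraction(p) / ratio + Fraction(1, 2))
            for s, p in populations.items()}


def boundary_ratio(populations, ratio):
    """Find the state whose decimal first reaches .500 as the ratio is lowered."""
    seats = seats_at(populations, ratio)
    # population / (n + 1/2), written as 2p / (2n + 1) to stay in exact fractions
    thresholds = {s: Fraction(2 * p, 2 * seats[s] + 1)
                  for s, p in populations.items()}
    state = max(thresholds, key=thresholds.get)
    return state, thresholds[state]


# Hypothetical populations, chosen only for illustration.
populations = {"A": 1_150_000, "B": 700_000, "C": 330_000}
start = 240_000

before = seats_at(populations, start)
state, ratio = boundary_ratio(populations, start)
after = seats_at(populations, ratio)
print(before, state, ratio, after)
```

Repeating the step from each new boundary ratio adds one seat at a time, so a table for every size of House can be built without guessing divisors.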

The Bureau of the Census handed Congress two other sets of tables, 
one based on the prescribed Vinton method, the other on a novel 
mathematical analysis of the problem devised by my successor in 
Washington and perfected later by Professor E. V. Huntington. The 
1911 apportionment was based on the Cornell tables. 

The figures of the 1920 census showed that the Huntington method,
or method of equal proportions, would give three seats to small states
(Rhode Island, Vermont and New Mexico) which the Webster method
would give to large ones (New York, North Carolina and Virginia).
This difference in the results was due to the fact that the Huntington
method makes the “critical fraction” separating large remainders for
which seats are apportioned from small ones for which they are not a
variable one lying always below .500 and above .414. The following
computation shows why I preferred the results of the Webster method.


                    Deviation from Population Standard by
States              Webster method      Huntington method
3 large states          +1.15                -1.85
3 small states          -1.41                +1.59
Total                    2.56                 3.44


The total deviation from the standard is one third greater by the 
Huntington method than by the Webster method. 


7 61st Congress, 3d Session, House Report No. 1911, Jan. 13, 1911, p. 9 f. 




The two methods applied to 1930 figures yielded identical results.

Ten years later Congress abandoned the Webster method and adopted
the Huntington one largely because it would give to Arkansas a seat
which the Webster method would have given to Michigan. After the
1950 census the Huntington method gave to Kansas a seat which the
Webster method would have given to California. Analysis of these
instances shows how the two tests worked.

Professor Huntington described his test in these words: “Test of
equal proportions—A transfer of a seat from one state to another
should be made if, and only if, the percentage difference between the
congressional districts⁸ in the two states would be reduced by the
transfer.”⁹

This leads to the following figures (in thousands):

                 Before transfer            After transfer
State          No. of    District        No. of    District
               seats     population      seats     population
California       31         353            30         341
Kansas            5         381             6         318
Disparity          8.0 per cent              7.5 per cent


Because 7.5 per cent was less than 8.0 per cent, the Huntington test
gave the seat to Kansas. I cannot see how that test bears on the problem
it tries to solve.

If we compute the number of seats and fractions of a seat to which
the two states were entitled in 1950 California should have had 30.68
and Kansas 5.52. The Huntington method in giving California 30 seats
curtailed its representation by .68 and in giving Kansas 6 seats swelled
it by .48, a total departure from the standard of 1.16. The Webster
method in giving California 31 seats would have swollen its representation
by .32 and in giving Kansas 5 would have curtailed it by .52, a
total departure from the standard of .84, about three fourths as much.
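
The comparison just made can be checked with a short Python sketch. The entitlements 30.68 and 5.52 are the figures given above; the function name `total_departure` is invented here, and decimal arithmetic is used so the sums come out exactly as written.

```python
from decimal import Decimal


def total_departure(entitlements, seats):
    """Sum of the absolute gaps between seats given and seats entitled."""
    return sum(abs(Decimal(seats[s]) - entitlements[s]) for s in entitlements)


# Entitlements in seats and fractions of a seat, as stated in the text.
entitlements = {"California": Decimal("30.68"), "Kansas": Decimal("5.52")}

huntington = {"California": 30, "Kansas": 6}
webster = {"California": 31, "Kansas": 5}

huntington_gap = total_departure(entitlements, huntington)
webster_gap = total_departure(entitlements, webster)
print(huntington_gap, webster_gap, webster_gap / huntington_gap)
```

The two totals are 1.16 and 0.84, and their ratio is a little over 0.72, which is the "about three fourths" of the text.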

Another form of comparing the methods may be clearer. Each 
method gave the two states together 36 representatives. The question 
is, should California have received 31 and Kansas 5, or California 30 
and Kansas 6? Since California had 84.75 per cent of the population of
the two states it should have that per cent of the 36 seats, in other
words 30.51 seats for it and 5.49 for Kansas; so California had the
stronger claim to the transferable seat. 

Cumulative evidence comes from applying each test to the 1940 fig- 

8 He meant by congressional districts what I prefer to call district populations.

9 E. V. Huntington, “Methods of Apportionment in Congress,” in 76th Congress, 3d Session,
Senate Document No. 304, p. 3.



ures for Michigan and Arkansas. Each method gave those states in 
combination 24 seats; the question was, should Michigan receive 17 
and Arkansas 7, or Michigan 18 and Arkansas 6? Michigan was entitled
by its population to 17.43 seats and Arkansas to 6.04 so Michigan
with 17 was curtailed by .43 and Arkansas with 7 was strengthened
by .96, a total deviation of 1.39. But if Michigan had received 18 seats
and Arkansas 6 the total deviation would have been only .61 (.57
and .04), less than half as much. Michigan had 72.95 and Arkansas
27.05 per cent of the population of the two states, so the former was 
entitled to 17.51 and the latter to 6.49 out of the 24 seats. Evidently 
Michigan should have been given the transferable seat. 
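
The pairwise division for Michigan and Arkansas can be verified in the same way. In this Python sketch the percentage 72.95 and the 24 seats are taken from the text; the function name `pair_shares` is invented for the illustration.

```python
from decimal import Decimal


def pair_shares(percent_of_first, seats_for_pair):
    """Split the pair's seats in proportion to the two populations."""
    first = percent_of_first / 100 * seats_for_pair
    return first, seats_for_pair - first


michigan, arkansas = pair_shares(Decimal("72.95"), 24)
print(michigan, arkansas)

# Compare the two possible whole-number splits against Michigan's share.
gap_if_18 = abs(18 - michigan)
gap_if_17 = abs(17 - michigan)
print(gap_if_18, gap_if_17)
```

Michigan's share comes to 17.508 of the 24 seats, so 18 is nearer to it than 17, though only narrowly.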
The argument thus far has indicated that Congress erred when it 
adopted the Huntington method and abandoned that of Webster. 
We come to the last question, What effect has the requirement,
“Each State shall have at least one Representative” on the problem?
That question but not its answer I saw through a glass darkly when I
wrote, “The Vinton method . . . involves a fundamental theoretical
error.¹⁰ It overlooks the crucial fact that seats in the House of Representatives
are of two classes, the 48, one for each state, which are
guaranteed by the Constitution and are as completely beyond the control
of Congress as the seats of the Senators are, and the remainder,
the number and distribution of which are under congressional control.
The two classes might be named the apportionable and the unapportionable
seats. The fact that they are not individually distinguishable
has apparently been responsible for the failure to recognize their existence.
To get this theoretical requirement clearly in mind it may be
helpful to think of the seats in the House of Representatives as numbered.
The first 48 seats, one for each state, would be numbered one
to indicate that there is no basis for distinguishing between them. The
next seat, numbered 49, would be apportioned to New York, number
50 to Pennsylvania and so on.”¹¹
This distinction has now been recognized by all authorities and ap- 
portionment tables give the states to which seats go in succession from 
No. 49 on to No. 435. Both the present and the proposed method under
1950 figures would give seats 49, 50, 51, and 52 to the largest states, 
New York, California, Pennsylvania and Illinois, in the order of their 
size but differ about seat 53. The present method gave that seat to 
New York with a district population (in thousands) of 7,415 although 

10 . . . I realized that the Webster method in either form and the Huntington method involve the . . .

11 Walter F. Willcox, “The Apportionment of Representatives,” address of the President,
American Economic Review, 6 (1916) Supplement.



Ohio, the fifth state in size, had a larger district population, 7,947, and 
in my judgment should have received the seat. If Congress should 
agree with me on that point, its choice would entail a decision that the 
method of included fractions leads to better results than the method 
of equal proportions and should displace it at the first opportunity. 
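
The disagreement over seat 53 can be shown as a comparison of two priority rules in Python. Several things here are assumptions for illustration: the function names are invented, only New York and Ohio are compared, New York's population of 14,830 thousand is inferred from the district population of 7,415 quoted for its two seats, and the equal-proportions priority is taken to be the usual population divided by the square root of n(n + 1). The proposed method is read as giving the next seat to the state with the largest district population at that moment.

```python
from math import sqrt


def equal_proportions_priority(population, seats_held):
    """Claim to the next seat under the method of equal proportions."""
    return population / sqrt(seats_held * (seats_held + 1))


def district_population_priority(population, seats_held):
    """Claim to the next seat measured by current district population."""
    return population / seats_held


def next_seat(states, priority):
    """Return the state with the strongest claim under the given rule."""
    return max(states, key=lambda name: priority(*states[name]))


# (population in thousands, seats held before seat 53 is assigned)
states = {"New York": (14_830, 2), "Ohio": (7_947, 1)}

present = next_seat(states, equal_proportions_priority)
proposed = next_seat(states, district_population_priority)
print(present, proposed)
```

Under the first rule New York's claim is about 6,054 against Ohio's 5,619; under the second Ohio's 7,947 exceeds New York's 7,415, which is the difference the text describes.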

In ending the argument about methods of apportionment we may 
summarize the conclusions. 

Methods are of two kinds, primary and secondary, the former finding 
a root in the Constitution, the latter not finding such a root. 

There are three primary methods, one based on “not more than one 
in every thirty thousand,” another based on “according to their re- 
spective numbers,” and a third based on the attempt to combine the 
second requirement with the last, “Each State shall receive at least one 
Representative.” 

The results of these three methods differ in that the first apportions
no supplementary seats for remainders, the second apportions such
seats for remainders larger than one half, the third, like the first,
apportions no seats for remainders but unlike it assigns one seat to each
state before apportionment begins.

The results also differ in their distribution of transferable seats, the 
number of which has increased from two after the first census to sixteen 
after the last. The first method distributes all transferable seats among 
the large states, the second divides them as evenly as possible between 
the large and the small states, the third distributes them all among
the small states. 

The Constitution seems to leave with Congress a choice between
two tests and two methods of apportionment. The first test would be
based on the nearness of two proportions, on the one hand the proportion
that a state's population makes of the population of all the states,
and on the other the proportion that the number of a state's repre- 
sentatives makes of the whole number of representatives. The second 
test would be based on the nearness of the district populations of the 
48 states to one another. 

The number of methods of apportionment giving different results 
varies with the figures of a census but is always one more than the 
number of transferable seats. The seventeen results possible after the 
1950 census would come from three primary and fourteen secondary 
methods; these seventeen methods make up a series wherein the results 
of each would differ from those on either side of it by transferring one
seat from a large state to a small one or vice versa.

The methods reaching these results use a series of seventeen divisors 




with 329,577 at one end and 365,394 at the other. They use also a 
series of seventeen critical fractions with zero at one end and one at the 
other.¹²

The article may close with a few words about another amendment 
now before the House Judiciary Committee. It provides for a slight 
automatic reduction in the size of the House after each future census. 
The law now declares that the President shall report to Congress after 
each census the result of redistributing the then existing number of 
members among the states according to the method last used by Con-
gress. The amendment would insert the words, “ten less than” before 
the words, “the then existing number.” 

The House is now about four and a half times as large as the Senate; 
at the start it was only two and one half times as large. In state legis- 
latures the difference between the size of the two branches is much 
less, and, since as a rule the more recent a state constitution is, the 
nearer to equality in size are the upper and lower houses, it would seem 
that American experience has led to a reduction of the average differ- 
ence. 

More important evidence comes from members of the House. Six- 
teen years ago a Congressman acting at my suggestion asked each of his 
colleagues whether he was satisfied with the present size of the House 
and, if not, whether he wanted it larger or smaller. Among the one 
third who replied, one half were satisfied with the present size of the 
House; of the other half about nine tenths wanted it smaller. 

More evidence to the same effect came from the apportionment de- 
bate in the 1920’s; then fifteen experienced and influential representa- 
tives gave an opinion; all but two wanted the House smaller. Probably 
no one now a member can recall, as Congressman Burton then could, 
when it was only three quarters as large. He had begun his long service
more than forty years before and had been for six years a Senator. He
said: “I began when there were 325 members of this body, and the
disadvantages in the transaction of business now as compared with 
then are beyond my powers to describe. It is not only the greater ex- 
pense but the greater confusion on the floor and the greater difficulty 
in the orderly transaction of work. ... I would rather see this House 
consist of 300 members than 435.” 

With the size of the House stabilized as it has been for fifty years
1910-1960, the decennial increase in the average population of a con- 


12 “The Apportionment Problem and the Size of the House: A Return to Webster,” Cornell Law
Quarterly, vol. 35 (1950), pp. 367-89; and “Last Words on the Apportionment Problem,” Law and
Contemporary Problems, vol. 17 (1952), pp. 290-301.



gressional district is about 44,000. If the size of the House had been 
reduced by ten members after the last census, that increase would 
have been about 49,000, a difference probably no member would worry 
about.

If such an amendment should be adopted the business of the House 
might be done faster and better and debating the amendment, even if 
Congress took no action on it, would bring home to it the fact that it 
can now change the size of the House slowly up or down without being 
blocked as it often was in the past by a tiny pressure group. 


THE KINSEY REPORT ON FEMALES* 


DOROTHY S. BRADY
Washington, D.C.


THE first chapter of “Sexual Behavior in the Human Female” contains
four pages of persuasion in three sections—“The Scientific
Objective,” “The Right to Investigate” and “The Individual’s Right 
to Know.” The argument begins in the first section with three cautiously 
phrased sentences— 


It should be clearly understood that the original goal of our study was 
the extension of our knowledge in an area in which scientific information 
appeared to be limited. In the course of the years it has become apparent 
that the data we have acquired may prove of value in the consideration of 
some of our social problems, but that was not why we originally began this 
research. 

It has been the history of science that any addition to our store of ade- 
quately established knowledge may ultimately contribute to man’s mastery 
of the material universe. (p. 7) 


progresses to the emotional level in the second section— 


The scientist who observes and describes the reality is attacked as an 
enemy of the faith, and his acceptance of human limitations in modifying 
that reality is condemned as scientific materialism. But we believe that an
increased understanding of the biologic and psychologic and social factors
which account for each type of sexual activity may contribute to an ultimate
adjustment between man’s sexual nature and the needs of the total
social organization. (p. 10)


and ends in the third section with a dramatic defense of scientific free- 
dom and individual rights— 


... We believe that the scientist who obtains his right to investigate from
the citizens at large, is under obligation to make his findings available to all
who can utilize his data. Any scientist who fails to report or to place his
findings in channels where they may serve the maximum number of persons, 
fails to recognize the sources of his right to investigate and thereby jeopar- 
dizes the rights of all scientists to investigate in any field. . . . 

. .. We believe that if we have any right to investigate in this field, we are 
under obligation to make the results of our investigations available to all 
who can read and understand and utilize our data. (pp. 10, 11) 


The obligation to make the findings available to the maximum number
of persons was fulfilled with all the craft and skill of modern
publicity. Yet the message conveyed to the public is not at all
unequivocal. As a sociologist put it, “Even the careful reader, trying
to avoid selective bias, is not always sure where these investigators
stand. They frequently hint at adverse evaluations of our
religious-moral traditions but draw back in the end, leaving evaluative
issues for future study. This somewhat confused situation is providing
a field day for moralists. Both outraged traditionalists, scanning
these books for purple passages, and opponents of religion, eager to
prove the validity of their preconceptions, are finding what they want
to find.”¹

* A review article on Sexual Behavior in the Human Female, by Alfred C.
Kinsey, Wardell B. Pomeroy, Clyde E. Martin, and Paul H. Gebhard.
Philadelphia and London: W. B. Saunders Company, 1953. Pp. 842; $8.00.

The issues surrounding the individual's right to know the results of 
publicly sponsored investigations should be considered apart from the 
subject of the research. The social control of scientific research ulti- 
mately relies on professional codes of practice. Neither naiveté nor 
indifference but plain common sense on the part of the general public 
and their representatives in legal office or on boards of foundations 
delegates the responsibility for research to the scientists as a group. 
There are not many important subjects of research that can be pre- 
sented so that all men, women and children could pretend to “read, 
understand and utilize the data.” Medicine is replete with examples of 
research on subjects of such grave importance to the general public 
that progress towards scientific findings makes the headlines of daily 
newspapers. Yet the sifting of contradictory evidence, the synthesis 
of generalizations, and the validity of applications are trustingly left
to the medical profession. Kinsey's interpretation of the code of the
scientist,

... to investigate honestly, to observe and to record without prejudice, 
to observe as adequately as human sense organs or the most modern
instruments may allow, to observe persistently and sufficiently in order that there
may be an ultimate understanding of the basic nature of the matter which
is involved. (p. 10)


stops short of the principle which safeguards the public interest in
scientific research. 

Modern science and its technological applications grew out of the 
concept of an experiment that can be repeated by different observers. 
Historians tell us that this idea in its time was revolutionary enough to 
create great public interest. The assurance that the results of an ex- 
periment will be repeated again and again is fundamental in the appli- 
cation of the results of scientific research. The results of an experiment 
are quantitative generalizations. The mere recording of observations 
has seldom led to a conflict between the observers and the legal or


1 Claude C. Bowman, “Sociological Implications of the Kinsey Studies,” Temple University, paper
presented to the Eastern Sociological Society, April 4, 1954.


698 AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1954 


religious authority. The conflicts have centered around the quantita- 
tive generalizations that constituted attacks on whole systems of ideas. 
The scientists with whom Kinsey compares himself—Kepler, Coperni- 
cus, Galileo, and Pascal—were not primarily observers; they generalized 
astronomical observations that had been accumulating for thousands 
of years.

Astronomy in the nineteenth century met a single historical event, 
the problem of the observation, that was difficult, if not impossible to 
replicate. Towards the end of the century as the mass of scientific data 
and generalizations became large and complex and empirical methods 
were extended to the biological and social sciences, the concept of repli- 
cation had to be given a new operational definition equivalent to its 
literal meaning. 

Modern statistical procedures give the answer to a serious question.
What can be done to assure the validity of the results of observations 
that, literally, cannot be repeated? It is no historical accident that Karl 
Pearson, one of the founders of modern statistics, tried to reformulate 
the nineteenth century concept of the repeated experiment in his 
Grammar of Science. It is no historical accident that his son, Egon 
Pearson, among others, brought statistical procedures to the testing of 
hypotheses. Statistical theory offers means for appraising the results 
of investigations that serve the function that the actual repetitions of 
the experiment generally served in the past and still serve in many 
provinces of scientific research. A profession recognizes the work of one 
of its members after scrutiny of all the operational phases of the inves- 
tigation. Statistical methods are involved in the appraisal at the pri- 
mary level of sampling and at the level of quantitative generalization. 

The specification of basic definitions and concepts is not within the 
province of statistical methods. It is the random or systematic
variability in observations of particular phenomena or in reporting on
particular forms of behavior that the statistician is equipped to study. 
The formulation of the operating hypotheses on the basis of which 
experiments or surveys are designed is also not a function of the 
statistician. Statistical procedures implement a principle expressed
nearly a century ago: “Wrong hypotheses, rightly worked, have
produced more useful results than unguided observations.”²

Kinsey's haste to present his findings to all those who might want to
apply them has forced into the popular press a process usually
conducted quietly through technical journals and conferences. The public
may be diverted by all the controversies over the validity and
importance of the Kinsey findings, but large groups of specialists in
the theoretical and applied fields concerned with the subject matter
are understandably bewildered.

2 DeMorgan, “A Budget of Paradoxes,” Open Court Edition, p. 87.

The use of statistical survey methods in the study of sex behavior 
did not originate with the Kinsey research. Kinsey's volumes refer to 
earlier studies and Appendix B of the report of the American Statistical 
Association's committee³ summarizes important differences between
methods used in the earlier surveys and in the Kinsey survey. The
data from a survey in a related field, “Social and Psychological Factors
Affecting Fertility,”⁴ which was conducted in Indianapolis in 1941, and
was also financed by a foundation, have been carefully analyzed in
more than twenty technical articles by demographers and sociologists.
Many of the original hypotheses formulated for testing with the data 
have been rejected. The final summaries present the results of all the 
statistical tests. The data from the Indianapolis survey and from the 
other surveys of sex behavior do not display the neat and systematic 
differences or similarities between population groups that Kinsey and 
his associates have discovered in their study of the human female. 

The Kinsey group places great emphasis on its interviewing techniques
and approach to the respondents. The evaluations of these techniques
in the report of the Committee of the Association and in Wallis'
review⁵ do not need to be summarized in this connection. All of the
questions about the sample of males also relate to the sample of females
which is “... even more inadequate than our sample of males in
representing lower educational levels, rural groups, and some of the
other segments of the population” (p. 57). Yet the relationships shown
in the analysis of female sex behavior are uniform and regular compared
to the results from most large-scale surveys of much less variable
types of human behavior. Other investigations, even with longer and
much more intensive statistical analysis, do not produce such elegant
statistical results.

Here lies the paradox. What is the Kinsey secret that, from an
inadequate sample, produces results of a stability that others cannot
duplicate? Even if the original reports by individuals interviewed were
absolutely accurate, the mathematical uniformities in the correlations
would be surprising. When there is a significant difference in sex
behavior between groups differentiated by some social characteristic,
there are few cases of inversions in the rankings. When there is no
correlation, the absence of a significant relation can scarcely be
questioned.

3 Appendix B of the report of a committee appointed in 1950 by S. S. Wilks as President of the
American Statistical Association to review the statistical methods used by Alfred C. Kinsey, Wardell B.
Pomeroy, and Clyde E. Martin in “Sexual Behavior in the Human Male” (Philadelphia, W. B. Saunders
Co., 1948), to be published in a monograph by the American Statistical Association in 1954.

4 Clyde V. Kiser and P. K. Whelpton, “Résumé of the Indianapolis Study of Social and Psychological
Factors Affecting Fertility,” Population Studies, Vol. 7, No. 2, November 1953, Great Britain.

5 W. Allen Wallis, “Statistics of the Kinsey Report,” Journal of the American Statistical
Association, 44 (1949), pp. 463-84.

Most social surveys offer a multiplicity of hypotheses about the
connection between a particular form of human behavior and the
demographic, social, and economic characteristics of various population
groups. The choice of the hypotheses to be tested, evaluated, and
presented to the general public more and more has to be made by the
investigating agency with such advice from specialists in the subject
matter and in statistical methods as it can marshal on the problem of
selection. It is doubtful whether the best hypotheses in a social survey
would ever yield as many unambiguous comparisons as the Kinsey
study of the human female. Certainly those hypotheses (and they are
not necessarily the best) that lead to published data do not reach the
Kinsey standard.

The particular detail in the Kinsey tables that does not conform to
experience with other empirical data is the virtual absence of zero
frequencies in the incidence tables that are published. When frequencies
of a particular form of behavior run low (under 15 per cent of the
population group), other empirical studies often show a zero frequency
in the sample reports for subgroups of the population even when the
number reporting in that subgroup is substantial, say between 50 and
500. The statistical reason for a wide variation in the percentages and
averages in samples so large is to be sought in the theory of multivariate
distributions. Most social and economic characteristics of the
population are highly intercorrelated. The accidents of sampling may result
in a subgroup defined by one or two factors that differs widely from the
population represented by that subgroup with respect to other
factors.
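
As a rough check on the scale involved, the sketch below computes the chance that a subgroup of n respondents shows no cases at all when each respondent independently has probability p of reporting the behavior. Independent (binomial) sampling is an assumption introduced here for illustration and is not the reviewer's model; the values of p and n are merely picked from the ranges mentioned in the text. Under that assumption the chance of a zero depends heavily on p: it is appreciable for very rare behavior in a subgroup of 50, and negligible at 15 per cent.

```python
def prob_zero_count(p: float, n: int) -> float:
    """Chance of observing no cases among n independent respondents,
    each showing the behavior with probability p (binomial assumption)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    if n < 0:
        raise ValueError("n must be non-negative")
    return (1.0 - p) ** n


if __name__ == "__main__":
    # Incidence levels under 15 per cent and subgroup sizes of 50 to 500,
    # the ranges mentioned in the text; the specific values are illustrative.
    for p in (0.01, 0.05, 0.15):
        for n in (50, 500):
            chance = prob_zero_count(p, n)
            print(f"p={p:.2f}  n={n:3d}  P(zero cases)={chance:.2e}")
```

Zeros that turn up more often than such a calculation allows are what one would expect when subgroups are not simple random draws, which is the direction of the argument about intercorrelated characteristics.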

The uniformities of the Kinsey correlations and the acknowledged 
variability in sexual behavior lead to only one conclusion. Most of the
Kinsey findings must be based on a few, relatively incontrovertible
relations. The apparent precision of his results must be based on
something that is effectively a tautology.

When the factors associated with a particular form of behavior are 
correlated with each other, a single simple and perhaps logical relation 
can be reproduced indefinitely through one-factor correlations that 
appear to be highly significant. The Kinsey report on the human female 
finds a systematic inverse connection between sexual behavior of
females and religious activity and a lack of relation with educational
attainments which were found significant in the study of males. Both
of these results may be merely reflections of a correlation of age and
marital status with religion and education among the women
interviewed by Kinsey and his associates.
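
A minimal sketch of how such a reflection can arise: the overall rate for a group is the average of its age-specific rates, weighted by the group's age mix. In the code below the age-specific rates are hypothetical and deliberately identical for both religious groups, and the two-bracket age split is a simplification. Only the 32 per cent share of devout Protestants aged 16-20 comes from the review; the 16 per cent for the inactive is a reading of "approximately twice."

```python
def crude_rate(age_shares, rate_by_age):
    """Overall rate of a group: age-specific rates weighted by the group's age mix."""
    if abs(sum(age_shares.values()) - 1.0) > 1e-9:
        raise ValueError("age shares must sum to 1")
    return sum(share * rate_by_age[age] for age, share in age_shares.items())


# Hypothetical cumulative incidence by age at interview, the SAME for every
# religious group: religion has no effect of its own in this illustration.
RATE_BY_AGE = {"16-20": 0.10, "21+": 0.40}

# Age mix at interview. 0.32 is the share quoted in the review for devout
# Protestants; 0.16 approximates "half as many" among the inactive.
AGE_SHARES = {
    "devout": {"16-20": 0.32, "21+": 0.68},
    "inactive": {"16-20": 0.16, "21+": 0.84},
}

CRUDE = {group: crude_rate(shares, RATE_BY_AGE) for group, shares in AGE_SHARES.items()}

if __name__ == "__main__":
    for group, rate in CRUDE.items():
        print(f"{group:9s} crude incidence = {rate:.3f}")
```

With these assumed figures the inactive group shows the higher crude incidence although religion plays no part in the construction, so a one-factor table by religious activity would display a difference produced entirely by age.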

The report on the human female provides in the first chapter some 
information on the basic sample not offered in the volume on the hu- 
man male. With the information on the distribution of the sample by 
religious groups and age at the time of interview and the cumulated 
tables in the rest of the volume, it is possible to trace some of the inter- 
correlation of the religious classification and age and—less completely— 
the connection between religious activity and marital status. Of the 
devout females, Protestant, Catholic, or Jewish, relatively more were 
under 25 years of age than of the moderately active or inactive. The 
proportion of devout Protestant females in the age bracket 16-20, was 
32 per cent and approximately twice the percentage of inactive Protes- 
tants of the same age. Among devout Catholic and Jewish females there 
were 33 and 52 per cent in that age bracket; among inactive 19 and 39 
per cent. The difference in the age distributions among the groups 
classified by the degree of religious adherence means that more older 
females were reporting on their activities in the younger ages among 
those inactive religiously than among the devout. The average age of 
those reporting on activity at a given age differs among the groups
classified by religious background much as their reported sexual activity
differs.

The cumulated tables reveal that there were more females married 
in the younger age groups among those who were inactive than among 
those who were moderate or devout in their religious activities. With 
some difficulty it is possible to read another difference among the re- 
ligious groups from the tables presented in this volume. Relatively 
more previously married females appeared among the inactive than 
among the moderate or devout religious groups.

All of the relations between religious affiliation and sexual behavior 
may be a reflection of more fundamental association between age, 
marital status, and the frequencies of various forms of sexual experi- 
ence reflected in the Kinsey samples. The connection between sexual 
experience and marital status in the case of females can hardly be 
debated and Kinsey’s results can easily be confirmed by the man on 
the street who has read the Bible and observed the social practices in 
his own time. The connection with age probably can not be explained
as a cultural tautology.

The report on the human female shows changes in the female behavior
occurring in this century in the nineteen-twenties that have
persisted until the present time. The relation between sexual behavior
and decade of birth, and by inference to age, brings back that very
inconvenient problem of the statistical sample. The Kinsey sample of
females was constituted mainly of women who had been to college, at
least for a few years, and whose parents had the capacity to finance 
girls as well as boys through college. The girls who went to college 
prior to World War I must have differed more from the general popula- 
tion of females than those whose parents were able to support this par- 
ticular luxury during the twenties. The Kinsey relation between decade 
of birth and female sexual behavior may be only a reflection of the more 
representative selection of the total female population that entered 
colleges and universities after the close of World War I. The popular 
literature of that decade stressed the appearance of the girl in search 
of a husband on the college campus. If, prior to the war, the woman 
who went to college was selected by her own drive towards a profes- 
sional activity or attainment, her behavior in other respects may have 
been very atypical. 

The Kinsey report on the female comes to a sweeping conclusion 
about the educated women still single in their older years: 

When such frustrated or sexually unresponsive, unmarried females at- 
tempt to direct the behavior of other persons, they may do considerable 
damage. There were grade school, high school, and college teachers among 
these unresponsive or unresponding females. Some of them had been
directors of organizations for youth, some of them had been directors of
institutions for girls or older women, many of them had been active in women’s
clubs and service organizations, and not a few of them had had a part in
establishing public policies. Some of them had been responsible for some of
the more extreme sex laws which state legislatures had passed. Not a few of 
them were active in religious work, directing the sexual education and trying 
to direct the sexual behavior of other persons. Some of them were medically 
trained, but as physicians they were still shocked to learn of the sexual ac- 
tivities of even their average patients. If it were realized that something 
between a third and a half of the unmarried females over twenty years of 
age have never had a completed sexual experience, parents and particularly
the males in the population might debate the wisdom of making such women
responsible for the guidance of youth. (p. 526)


In all there were 299 unmarried females 36 years of age or older and of 
these, 112 were 46 or older. 

With so many of them teachers, directors of institutions, and medi- 
cally trained as the text of this report suggests, it is possible to reach 
only one conclusion. These were the women of an earlier era dedicated 
to a professional life and, perhaps of more importance in the present
context, were among those who “covered up” their reports on
experience. The struggle of women for professional recognition in many fields
is over. The particular type of determined individual who entered pro- 
fessional training before 1918 and may have been represented by the 
few that came into Kinsey’s sample can not be considered representa- 
tive of women of the same age in the years to come. 

Whether the relation between age, decade of birth, and sexual be- 
havior relates only to Kinsey’s sample or to a more general group of 
the female population, differences in age and marital status explain 
most of the correlations found between sexual behavior and religious 
activity. In other words, it appears that sexual activity explains re- 
ligious inactivity. The smoothness of the Kinsey correlations relating 
sexual activity inversely to religious activity is apparently the regu- 
larity imposed by dividing a range of a variable into three parts, low, 
medium, and high.

Had the authors standardized their comparisons for age and marital 
status, and had marital status been defined in four groups to include 
the status of “being engaged,” all of the generalizations in this volume 
about the connection between sexual behavior and social factors might 
have been changed. 
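
The standardization suggested here can be sketched as follows. The function reweights each group's stratum-specific rates to one common distribution of age and marital status (direct standardization). All counts are hypothetical, and only two marital categories are used for brevity where the text proposes four.

```python
def direct_standardize(cases, totals, standard_weights):
    """Rate a group would show if its strata carried the standard weights."""
    rate = 0.0
    for stratum, weight in standard_weights.items():
        if totals[stratum] == 0:
            raise ValueError(f"no respondents in stratum {stratum!r}")
        rate += weight * cases[stratum] / totals[stratum]
    return rate


STRATA = [("<25", "single"), ("<25", "married"), ("25+", "single"), ("25+", "married")]

# Hypothetical counts: respondents, and respondents reporting the behavior.
TOTALS = {
    "devout": dict(zip(STRATA, [60, 10, 20, 10])),
    "inactive": dict(zip(STRATA, [20, 20, 20, 40])),
}
CASES = {
    "devout": dict(zip(STRATA, [6, 8, 6, 9])),
    "inactive": dict(zip(STRATA, [2, 16, 6, 36])),
}

# Standard population: the two groups pooled.
POOLED = {s: sum(TOTALS[g][s] for g in TOTALS) for s in STRATA}
GRAND_TOTAL = sum(POOLED.values())
WEIGHTS = {s: POOLED[s] / GRAND_TOTAL for s in STRATA}

CRUDE = {g: sum(CASES[g].values()) / sum(TOTALS[g].values()) for g in TOTALS}
STANDARDIZED = {g: direct_standardize(CASES[g], TOTALS[g], WEIGHTS) for g in TOTALS}

if __name__ == "__main__":
    for g in TOTALS:
        print(f"{g:9s} crude={CRUDE[g]:.3f}  standardized={STANDARDIZED[g]:.3f}")
```

In this constructed example the crude rates are 0.29 and 0.60 while the standardized rates coincide, because the stratum-specific rates were made equal. Real data would not be expected to agree so exactly; the point is only that the comparison can change once age and marital status are held fixed.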

Kinsey and his associates apparently started with a fairly simple
hypothesis about the sexual behavior of human females, not too well
formalized to be sure, that it has an anatomical and physiological
similarity with the males, but is inhibited by social, legal, and other
cultural codes and practices.

Through the series of tables describing the sexual activities of the
religious groups of females there is a causal chain that deflected Kinsey
and his associates from their initial and relatively simple operating
hypothesis. The greater sexual activity of the religiously inactive
females appears to lead inevitably to a higher proportion of broken
marriages. The Kinsey inference that greater sexual experience for the
female before marriage leads to quicker and more satisfactory sexual
marital relations is contradicted by two correlations in his survey.
Pre-marital hetero-sexual experience is correlated with extra-marital
experience. The groups that have the highest incidence of extra-marital
experience include the greatest number of marriages broken by
separation and divorce.

The authors contend with these inferences from their study in various
oblique ways. Early in the text they state: “The generalizations
throughout the present volume have therefore been restricted to the
particular sample that we have had available.” Progressively to the
final chapters they throw away this restriction. The book would make
an excellent subject for textual analysis by the techniques of historians
and Biblical scholars.


There are some who have feared that а scientific approach to the prob- 
lems of sex might threaten the existence of the marital institution. There 
are some who advocate the perpetuation of our ignorance because they fear 
that science will undermine the mystical concepts that they have substituted 
for reality, But there appear to be more persons who believe that an exten- 
sion of our knowledge may contribute to the establishment of better mar- 
riages. (p. 13) 

There are legal and social responsibilities in any marriage; there are 
economic problems to be solved; above all, there are psychologic adjust- 
ments to be made between the wedded partners. Sexual adjustments repre- 
sent only one aspect and not necessarily the most important aspect of mar- 
riage. No balanced program for American youth can be confined to prepar- 
ing them for sexual relationships in marriage. But it is inconceivable that 
anyone who is objectively and scientifically interested in successful mar- 
riages should fail to appreciate the significance of coitus in marriage, or 
wholly ignore the correlations which exist between pre-marital activities and 
the sexual adjustments which are made in marriage. (p. 391) 

These correlations between pre-marital and extra-marital experience may 
have depended in part upon a selective factor: the females who were inclined 
to accept coitus before marriage may have been the ones who were more 
inclined to accept non-marital coitus after marriage. A causal relationship 
may also have been involved, for it is not impossible that non-marital coital
experience before marriage had persuaded those females that non-marital
coitus might be acceptable after marriage. (pp. 427-428)

Extra-marital coitus had figured as a factor in the divorces of a fair
number of the females and males in our histories. We have data on 907
individuals (female and male) who had had extra-marital experience and whose
marriages had been terminated by divorce. We have the subjects’ judgments
of the significance of their extra-marital coitus in 415 cases. In nearly
two-thirds (61 per cent) of these cases, the subject did not believe that his or her
own extra-marital activity had been any factor in leading to that divorce. 
... It is to be noted, however, that these were the subjects’ own estimates 
of the significance and, as clinicians well know, it is not unlikely that the 
extra-marital experience had contributed to the divorces in more ways and 
to a greater extent than the subjects themselves realized. (p. 435) 

These data once again emphasize the fact that the reconciliation of the 
married individual’s desire for coitus with a variety of sexual partners, and 
the maintenance of a stable marriage, presents a problem which has not been 
satisfactorily resolved in our culture. It is not likely to be resolved until man 
moves more completely away from his mammalian ancestry. (p. 436) 

The failure to recognize these differences in the needs of the two sexes for 
a regular sexual outlet may be the source of a considerable amount of diffi- 
culty in marriage. It is the source of many social disturbances over questions 
of sex. In establishing sex laws, in considering the sexual needs of females 
and males in penal and other institutions, in considering the need among
females and among males for non-marital sources of sexual outlet, and in
various other social problems, we cannot reach final solutions unless we 
comprehend these considerable differences between the sexual needs of the 
average female and the average male. (p. 682) 

The possibility of reconciling the different sexual interests and capacities
of females and males, the possibility of working out sexual adjustments in
marriage, and the possibility of adjusting social concepts to allow for these
differences between females and males, will depend upon our willingness to
accept the realities which the available data seem to indicate. (pp. 688-689) 


The reality the authors found, perhaps reluctantly, in the last chap- 
ter, namely, a fundamental difference between males and females, is 
supported by information from the survey that shows “the male’s 
greater inclination to be promiscuous” (p. 683). The table shows the 
number of partners reported by males and females in pre-marital pet- 
ting and pre-marital coitus. The difference in average numbers of 
partners in pre-marital coitus reported by females and by males is so 
great that, in view of the fact that the numbers of males and females 
in the population are nearly equal, only one conclusion can be reached: 
that the samples of males and females came from different populations. 
This possibility is admitted in the first chapter in a discussion of the 
sample of females. The discrepancy in the data on this subject, sum- 
marized differently, is explained on page 79 in fine print. Reasons 
given, among others, are differences in the distribution of the samples
by educational attainment, the omission of prostitutes from this report,
the fact that some of the men reported on experience abroad while
in the armed forces, and the possibility that “the females may have
covered up in reporting their pre-marital experience, or the males may
have exaggerated their reports of such experience.”
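
The arithmetic behind this conclusion can be checked with a small sketch. In a closed population every pre-marital pairing adds one partner to a male's count and one to a female's, so the two totals are equal, and with equal numbers of males and females the two averages must agree. The names and pairings below are hypothetical, chosen only to show that the averages coincide even when partners are distributed very unevenly.

```python
from collections import Counter


def mean_partner_counts(pairings, males, females):
    """Average number of distinct partners per male and per female."""
    male_partners = Counter()
    female_partners = Counter()
    for male, female in set(pairings):  # count each distinct pair once
        male_partners[male] += 1
        female_partners[female] += 1
    mean_male = sum(male_partners.values()) / len(males)
    mean_female = sum(female_partners.values()) / len(females)
    return mean_male, mean_female


# Hypothetical closed population with equal numbers of each sex.
MALES = ["m1", "m2", "m3", "m4"]
FEMALES = ["f1", "f2", "f3", "f4"]

# Deliberately lopsided: one female accounts for four of the five pairings.
PAIRINGS = [("m1", "f1"), ("m2", "f1"), ("m3", "f1"), ("m4", "f1"), ("m1", "f2")]

MEAN_MALE, MEAN_FEMALE = mean_partner_counts(PAIRINGS, MALES, FEMALES)

if __name__ == "__main__":
    print(f"mean partners per male   = {MEAN_MALE}")
    print(f"mean partners per female = {MEAN_FEMALE}")
```

A large gap between the reported averages therefore cannot come from one closed, equally divided population reporting accurately. It has to come from differences between the populations sampled or from the reporting itself, which are the explanations the review goes on to list.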

The inferences in the last chapters represent the confusion of
investigators not equipped with technical tools, without much experience in
the analysis of multivariate relations. At some point in the analysis
of their data for females, the authors were forced to reject their original
hypothesis. Without a specific formulation of the quantitative relations
being tested, they may not have recognized what happened. The
importance of social and psychological factors as developed mainly in the
final chapters and references in earlier chapters could easily have been
interlarded.

The book leaves the very strong impression that even Kinsey and
his associates would not replicate the generalizations in this volume.
The contradictions in their inference will serve them well. Several
other systems of generalizations can be shown to be consistent with
some of the pronouncements in this volume.


UNSOLVED PROBLEMS OF EXPERIMENTAL STATISTICS* 


JOHN W. TUKEY
Princeton University 


IT WOULD not be misleading to suggest that there is really only one
unsolved problem of experimental statistics: “How can we recognize
the problems of experimental statistics?” We can recognize a good
many unsolved problems by accident, but we probably miss many im- 
portant ones for far too many years. Difficulties in identifying prob- 
lems have delayed statistics far more than difficulties in solving prob- 
lems. This seems likely to be the case in the future, too. 

Thus it is appropriate to be as systematic as we can about unsolved 
problems. Any system may be a start toward, or even a partial solution 
of, this problem of recognition. I shall try to do this by stating first 
some principles and then some consequences. I shall strive to phrase 
all these principles as generally as possible, in the hope of prolonging 
their useful life. 

A discussion of examples of these 18 general principles will set forth
a certain number of unsolved problems, while a list of 51 provocative 
questions poses many more. (This list is admittedly and intentionally 
incomplete.) The account closes with a discussion of the possibility of 
orienting experimental statistics toward problems rather than tech- 
niques. 

SOME GENERAL PRINCIPLES


If we feel that the detailed problems of experimental statistics arise 
from the interaction of certain general principles among themselves and
with classes of experiments, it is reasonable to try to state and illustrate
some of these principles. Before stating the hypergeneral principles on
which these general principles hang, we need to explain the sense in
which three terms, ends, areas and considerations will be used there and
in the sequel.

By an end we refer to real purposes of the user of the statistical
technique. These purposes are often unformulated, and their partial
formulation often requires the statistician to “psychoanalyze” his client
(in the writer’s view this is one of the most important functions of the
statistical consultant!). An immediate end is a formalized (and almost
certainly partial) end such as to describe an appearance (e.g., by a
point estimate), to make a test of significance, to make a decision, or
to reach a confidence statement.


* Prepared in connection with research sponsored by the Office of Naval Research. Presented to
the American Statistical Association and the Biometric Society 28 Dec. 1953.




An area is a class of situations with qualitatively similar data, such, 
for example, as the class where two sets of observations are presented 
for the comparison of the “typical” values of the corresponding popu- 
lations (means, medians, and the like serve as “typical” values). Within 
an area, different techniques are competitive. Within an area, the his- 
torical, evolutionary, and logical relations of different techniques are 
relatively clear.

A consideration is recognition that the world may very well be more
complex, annoying, and difficult than our earlier techniques had sup- 
posed. Thus we might admit—nay, even take into consideration—the 
possibility that we did not know the variance, that the distribution 
might not be normal, that a certain fraction of the observations are 
affected by blunders, etc. 

The four hypergeneral principles, which may seem harmless until 
we come to their consequences, run as follows: 


(A) Different ends require different means and different logical 
structures. 

(B) In each area, statistical method must and does evolve, mainly 
by adding both immediate ends and considerations. 

(C) While techniques are important in experimental statistics, 
knowing when to use them and why to use them are more 
important.

(D) In the long run, it does not pay a statistician to fool either him- 
self or his clients. 


We have one hypergeneral principle about logical structure, two about 
statistical method, and one about statisticians. The last may seem to
be of smallest scope, but when we consider matters carefully, we see
that (A), (B), and (C) all follow from (D). To insist on one means or
one logical structure for different ends, or to feel that there is a solution
to the problems of method, are obvious attempts of the statistician
to fool himself.

Clearly, one very general consequence is this: “The complexity of
experimental statistics will clearly increase.”

Reducing the generality somewhat, we list some consequences of 
(A), (B), (C), and (D) which are themselves general principles: 


(A1) Statistics needs constantly to recognize new ends for which it 
should try to furnish new means and new logical structures. 

(A2) Statistics needs to avoid over-unification, while encouraging
coordination.


708 AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1954 


(A3) Statistical methods should be tailored to the real needs of the
user. 

(A4) Statistics needs continually to compare its own logical struc- 
tures with the logical structures currently used or being put 
into use by science, engineering, business, and military ad- 
ministration, and other fields. 

(B1) In any area of statistical method, analysis cannot be usefully 
considered alone for more than a limited time; after a time 
appropriate to the area, design must be brought in. 

(B2) There are normal sequences (patterns) of growth in immediate
ends.

(B3) There are normal sequences (patterns) of growth in
considerations.

(B4) Growth in immediate ends can sometimes be neglected, but
growth in considerations is almost never to be neglected.

(B5) At any one time, different areas of statistical methodology will
be in different states of evolution, both in immediate ends and
in considerations.

(C1) Competitive statistical techniques indicate a need for manuals
of “when to choose which” and not just selection of “the best”
technique.

(C2) Statisticians owe their clients help in choosing wisely between
high confidence in a short inference and low confidence in a long
inference.

(C3) Techniques of evaluating both the isolated experiment and
history down to date will continue to be useful.

(C4) “What should be done” is almost always more important than
“what can be done exactly.” Hence new developments in
experimental statistics are more likely to come in the form of
approximate methods than in the form of exact ones.

(D1) Statisticians must face up to the existence and varying
importance of systematic errors.

(D2) Statisticians have an obligation to clarify the foundations of
their techniques for their clients.

(D3) Statisticians should be honest and expository about the relation
of precise “assumptions” and exactly “optimum” solutions to
real situations.

(D4) In every statistical area, we almost certainly need methods
admitting one more nuisance parameter, methods of one higher
level of robustness and de-parametrization, methods with
both of these desiderata.



(D5) Statistics must continually study the behavior of its techniques 
when their conventional assumptions are not true. 


ILLUSTRATIVE EXAMPLES 


I will try to illustrate these principles by discussing particular prob- 
lems of experimental statistics which show their impact. These exam- 
ples are not intended to be an exhaustive list. In the light of general 
principle (C), a problem in experimental statistics is not solved by the 
existence of a mathematical statistical paper showing how to find a 
solution, or even by the existence of a technique with tables. There is 
needed an understanding of when and why to use the technique, and 
this understanding must be spread through a certain minimum number, 
sometimes small and sometimes large, of experimental statisticians. 
Thus we may, and should, discuss as unsolved problems some which 
others may consider as already solved. 

(A1) Statistics needs constantly to recognize new ends for which it 
should try to furnish new means and new logical structures. A very good 
illustration of this principle is provided by recent developments in con- 
nection with the problem of multiple comparisons. Where one immedi- 
ate end grew a few years ago, three immediate ends flourish today and 
promise to flourish for a long time. These three are: 


(1) The immediate end of providing increments to the store of
established knowledge. This to be done by the analysis of existent
data with control of the error rate. The analysis to be formulated
in confidence or significance statements (cf. Tukey [35, 36, 37],
Duncan [11, 12, 13] and others).

(2) The immediate end of providing protection against too bad a
selection among candidates. This to be done by a sequential
design of measurement. The result to be selection of the
apparently leading candidate when the “stop rule” takes effect (cf.
Bechhofer, Dunnett, Sobel [1, 2, 14]).

(3) The immediate end of minimizing, in some sense, the sum of the
costs of experimentation and the costs of poor choice. This is to
be done by a sequential design of measurement. The result to be
selection of the apparently leading candidate when the “stop
rule” takes effect (cf. Grundy, Healy, and Yates [40, 41],
Sommerville [31]).


In my judgment, there will be a continuing place for all three immediate
ends. To a reasonable extent these places correspond to the terms
“basic research,” “developmental research,” and “operations research”
[cp. 22].




This problem of multiple comparisons is still unsolved as a problem 
of multiple comparisons, because the necessary minimum numbers of 
experimental statisticians have not yet acquired a working understand- 
ing of the new immediate ends involved, or of when which technique 
is appropriate. Analogous problems, involving immediate ends which 
differ in analogous ways, are to be expected in more areas of statistics. 

(A2) Statistics needs to avoid over-unification, while encouraging
coordination. It is now known to mathematical statisticians that all the
currently routine modes of statistical technique—significance state- 
ments, point estimates, confidence statements, etc.—can be formulated 
as decision problems. There is a tendency in the air to do so to an in- 
creasing degree. This may be good mathematical statistics, because it 
may encourage the interchange of useful mathematical techniques 
among the modes. (We are likely to see in due course whether or not 
this is true.) But it would surely be very bad experimental statistics to 
treat all these modes in too unified a way. For then some experimental 
statisticians might be led to forget whether their clients wanted (ex- 
plicitly or implicitly) a decision or a confidence statement, whether 
they had done the experiment as a basis for immediate action or as a
contribution to knowledge. What more important matter could be for- 
gotten by any experimental statistician? 

In almost every area of experimental statistics, there is a problem of
providing enough different methods to meet the user’s needs. 

(A3) Statistical methods should be tailored to the real needs of the user.
In a number of cases, statisticians have led themselves astray by
choosing a problem which they could solve exactly but which was far from
the needs of their clients. They could have chosen a problem closer to
their client's needs at the price of an approximate solution. In most of
these cases, tailoring the statistical method to the real needs of the 
client would have meant, and still means, giving up exactness for the 
sake of usefulness. Realistic assessment of value must urge us to make 
such “deals” freely and frequently. 

The broadest class of such cases comes from the choice of significance 
procedures rather than confidence procedures. It is often much easier 
to be “exact” about significance procedures than about confidence
procedures. By considering only the most null “null hypothesis” many
inconvenient possibilities can be avoided. If the varieties are not different
they cannot interact with fertilizers or blocks. If the treatment has no 
effect, we do not have to be concerned with how its effect varies with 
the weight or health of the animal or child. And so on—and on. In these 
examples, it will be clear to many that we are dodging substantial issues.




But throughout experimental statistics there are many areas with sig- 
nificance procedures but without confidence procedures. Almost every 
one of these areas needs one or more rough confidence procedures.
Rough procedures will be adequate because the assumptions are not
likely to be closely true, so that the probability statements need not
follow precisely from the assumptions either. One or more, because
techniques based on alternative assumptions give both greater freedom
of action and greater confidence in results to the analytical statistician. 
Here are many unsolved problems in experimental statistics! 

At another level of unsolution are the problems where the approxi- 
mate mathematical statistics has been done, but no use has been made 
of the results. One outstanding example is the computation by Haldane 
[19] of the effect of non-normality on the variance of the estimated cor- 
relation coefficient. Who has put this to use? Yet it surely is enough to 
support an empirical robustification procedure involving an effective 
number of pairs of observations. There must be many more examples 
like this, where the results have not been carried through to practical 
usability. 

(A4) Statistics needs continually to compare its own logical structures 
with the logical structures currently used or being put into use by science, 
engineering, business, and military administration, and other fields. 
We can indicate an unsolved problem here which is not likely to be
solved in the near future. This is the problem of formalizing some 
further part of the process of developing new scientific concepts and 
new scientific theories. Only the most elementary steps in this process 
have been formalized (in terms of the analysis of conventional types 
of experiments, of the testing of goodness of fit, and the like).
Undoubtedly some, at least, of the less elementary steps can be formalized, but
how? And which ones?

This is a vague and diffuse problem, but it is a very important prob- 
lem indeed. Some would construe it as a problem for philosophers, but 
I feel that it will require quantitative philosophers (that is, experimental
statisticians). 

(B1) In any area of statistical method, analysis cannot be usefully 
considered alone for more than a limited time; after a time appropriate
to the area, design must be brought in. The second and third types of
multiple comparison procedures cited above (A1) furnish an excellent
example of the need for design. For the immediate ends involved the
only action, once the measurements are made, is to take the seemingly
best candidate. That this is reasonable is, and has been, clear to all.
Even a very moderate degree of sophistication was barred from these




situations until the question of when to stop taking measurements was 
introduced. There must now be many similar cases in other areas today 
where design considerations have not yet been properly introduced. 

(B2) There are normal sequences of growth in immediate ends. One
natural sequence of immediate ends follows the sequence: 


(1) Description 

(2) Significance statements 
(3) Estimation 

(4) Confidence statement 
(5) Evaluation 


In the case of a double binomial the successive levels are illustrated by
the sequence of statements:


(1) The percentage of success observed among A’s was higher than
among B’s.

(2) The percentage of success among A’s was significantly greater
than among B’s.

(3) The observed percentage of success among A’s exceeded that
among B’s by a difference of 0.28 in logits. (Or, perhaps, by 15
per cent.)

(4) The difference in logits corresponding to the increased percentage
of success in A’s as against B’s is between 0.18 and 0.43 with
95 per cent confidence. (Between 10 per cent and 22 per cent
with 95 per cent confidence, perhaps.)

(5) Considering both this experiment, and all the observations
reported by Smith, Jones, Brown, Robinson, and their coworkers,
the indicated difference in logits lies between 0.32 and 0.36 with
5 per cent diffidence (the difference in per cent lies between 17
and 19, perhaps).
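
The first four levels can be made concrete with a small computation. What follows is only a minimal sketch under stated assumptions: the counts are hypothetical, the logit is taken as the natural-log odds (other conventions rescale it), and the interval is the usual large-sample normal approximation rather than any exact procedure. Level (5) is deliberately not attempted.

```python
import math

def logit(p):
    """Natural-log odds of p; an assumed convention (other scalings exist)."""
    return math.log(p / (1.0 - p))

def double_binomial_summary(succ_a, n_a, succ_b, n_b, z=1.96):
    """Levels (1)-(4) for two binomial samples, by large-sample approximation."""
    p_a, p_b = succ_a / n_a, succ_b / n_b
    diff = logit(p_a) - logit(p_b)  # level (3): point estimate in logits
    # Approximate standard error of a difference of two empirical logits.
    se = math.sqrt(1 / succ_a + 1 / (n_a - succ_a)
                   + 1 / succ_b + 1 / (n_b - succ_b))
    lower, upper = diff - z * se, diff + z * se  # level (4): rough 95% limits
    return {
        "a_higher": p_a > p_b,                   # level (1): description
        "significant": lower > 0 or upper < 0,   # level (2): two-sided test
        "diff_logits": diff,
        "interval": (lower, upper),
    }

# Hypothetical counts: 130 successes in 200 A's, 100 successes in 200 B's.
summary = double_binomial_summary(130, 200, 100, 200)
```

The same function answers all four levels at once, which makes plain that each level adds a statement rather than replacing the one before it.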


The order of (2) and (3) is not nearly so well defined as that of any other
pair. In some areas, and to some experimental statisticians either order
would be wrong. We have chosen this order for definiteness and not
with sureness.

In the actual case of the double binomial, almost every experimental 
statistician can handle (1), (2), and (3) easily. Some are not perturbed
by (4) and of these most but not all can handle (4) correctly. No one, so
far as the writer knows, can treat (5) adequately. In other areas we may
stop at level (1), at level (2), at level (3), or at level (4), but in almost
every case there is a next level which represents an unsolved problem.

How to operate at level (5) seems to represent an unsolved problem




in many areas. It is a real and important problem, and one whose solu- 
tion should not be approached flippantly or lightly. Either the classical 
example of the charge on the electron (as of 1938) or the current exam- 
ple of the heat of sublimation of carbon (which has not improved during 
the last 25 years) shows that the proper evaluatory answer may be: 
“The available determinations fall into two systematically different 
groups, which correspond to values between A and B and between C 
and D, respectively, and which we are confident cannot be brought into 
agreement without the introduction of a new systematic adjustment.”
How many other unusual (from the point of view of formal statistics
as found in the books) kinds of conclusions are reasonable in evaluation
of all available data? This is not an easy question, but its solution (at 
least its partial solution) is a prerequisite to that of any problem of 
evaluation. 

There are, of course, other normal sequences of immediate ends, 
leading mainly through various decision procedures, which are appro- 
priate to development research and to operations research, just as the 
sequence we have just discussed is appropriate to basic research. (Here
“There are, of course” means “There must be! We are sure they exist, 
but we cannot specify them today.”) 

(B3) There are normal sequences of growth in considerations. The area
of comparing the typical values of two populations with aid of a sample
drawn from each illustrates a customary sequence of evolution in
considerations quite nicely. The sequence runs:


(1) Normal populations of equal and known variance.
(2) Normal populations of general (i.e., probably unequal) and
known variances.
(3) Normal populations of identical but unknown (but estimated)
variance.
(4) Normal populations of general and unknown (but estimated)
variances.
(5) Symmetrical populations of unknown shape and unknown but
equal variance.
(6) Symmetrical populations of the same unknown shape but
general and unknown variances.
(7) Symmetrical populations of unknown shapes and variances.
(8) Populations of unknown but equal shape and variance.
(9) Populations of the same unknown shape and unknown and
general variances.
(10) Populations of general and unknown shapes and variances.




Here we have exemplified the growth in considerations like these: 


(a) The scale of the populations might be different. 

(b) The variance might not be known. 

(c) The symmetrical populations might not be normal. 
(d) The populations might not have the same shape. 
(e) The populations might not be symmetrical. 


It is by considering such unpleasant possibilities that we sharpen our 
techniques and strengthen our understanding. 

The normal distribution suffices for levels (1) and (2), while level (3)
requires Student's t. The next level, (4), provides the Fisher-Behrens
problem, while (5) seems to be the likely end of the direct application
of Wilcoxon-Walsh [38-39] procedures (so far only applied to the
matched observation case). Beyond this point the terra is rather in- 
cognita, but we may note that through level (7) we need to make no 
distinction between medians and means, while simple rank order pro- 
cedures are exact through level (8). 

Not only does this area—and remember that it is one of the most 
carefully worked over of all areas—provide a good example of a normal 
sequence of growth in considerations, but it also provides many exam- 
ples of unsolved problems. The Fisher-Behrens problem arises quite 
early, at only level (4) in the list, yet today the Fisher solution is 
known not to be unique [33], even in the domain of fiducial probability,
while the Aspin-Welch solution may or may not correspond to an exact
solution as well as an asymptotic one. What should a poor experimental
statistician do? 

Who has good-looking solutions for the problems posed by (5), (^, 
(9), or (10)? Who knows how the solutions for level (4) just mentioned
behave as to error rate when (5), (6), or (7) represents the facts? How
do the solutions for level (4) behave as to power when either (4) or (3)
represents the facts? And the reader can add many more. 
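
The step from level (3) to level (4) can be seen in a few lines of arithmetic. The sketch below is illustrative only: the data are hypothetical, and the level (4) function uses the Welch-Satterthwaite approximate degrees of freedom, which is an assumption of this example and not the Aspin-Welch series solution itself, nor the Fisher solution.

```python
import math
from statistics import mean, variance

def pooled_t(x, y):
    """Level (3): equal but unknown variances (Student's t)."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    s2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    t = (mean(x) - mean(y)) / math.sqrt(s2 * (1 / nx + 1 / ny))
    return t, df

def welch_t(x, y):
    """Level (4): general unknown variances, approximate degrees of freedom."""
    nx, ny = len(x), len(y)
    vx, vy = variance(x) / nx, variance(y) / ny
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, df

# Hypothetical data: sample b is far more variable than sample a.
a = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2]
b = [8.0, 12.5, 9.1, 13.0]
t3, df3 = pooled_t(a, b)
t4, df4 = welch_t(a, b)
```

With unequal spreads the approximate degrees of freedom fall well below the pooled value, so the two levels can lead to visibly different statements from the same data.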

The foreseeable, normal growth in considerations will provide un- 
solved problems for a long time to come in almost every area of statistics.

(B4) Growth in immediate ends can sometimes be neglected, but growth
in considerations is almost never to be neglected. We can use the
two-sample area to illustrate this principle also. If we had a clear and
reasonable solution to the Fisher-Behrens problem, very few experimental
statisticians would dare ignore it. But many are content to teach
significance testing without confidence procedures. (The young chemist
who can analyze the variance of Latin squares and snatch out single
degrees of freedom with zest and ease, but who cannot use Student's t




to set confidence limits on A − B, because no one ever mentioned it to
him, is a poor witness to the teaching of chemists by statisticians!) 

(B5) At any one time, different areas of statistical methodology will be
in different states of evolution, both in immediate ends and in
considerations. We have only to contrast the two-sample area with the m × n
contingency-table area or the correlation-coefficient area with the
measures-of-nonnormality-for-time-series area to find application of
this general principle.

(C1) Competitive statistical techniques indicate a need for manuals of 
“when to choose which” and not just selection of “the best” technique. Our 
discussion of the two-sample area should have made it clear that what 
is needed here is a guide to the various techniques explaining why and 
when to use them. No selection of a single “best” technique is going to 
be satisfactory. 

Another widely separated area which illustrates the principle nicely 
is the response maximization area. Here we have a spectrum of sugges- 
tions from the carefully thought-out “circle and bee-line (possibly re- 
peated) and then survey” technique of Box and Wilson [5] to the creep- 
ing technique of Friedman and Savage [16] and the sophisticated but 
so far one-dimensional technique of Robbins and Monro [30]. I am sure 
that all of those named have their place, as do, no doubt, some of the 
intermediate points in the spectrum. I have, indeed, some idea of where
these places are. But I would like to know far more precisely where 
these places are and why. (You couldn’t possibly sell me a single best 
method!) 

(C2) Statisticians owe their clients help in choosing wisely between high 
confidence in a short inference and low confidence in a long inference. In 
the analysis of three and more way analyses of variance, there arises 
the problem of choosing the correct error term (e.g. Goulden [17]).
This is the first big problem in the analysis of variance, and one that is 
still very effective in separating the statisticians from the children. If 
one classification is years, one choice can be put into words as follows: 
Will you have differences in average performance averaged over these 
particular years, with narrow confidence limits, or will you have dif- 
ferences in average performance, averaged over a population of years of 
which these years are a sample, with much broader confidence limits?
With regard to this particular example, most experimental statisticians 
are clear and effective. Thus, it may be a solved problem. But in many 
other areas the corresponding problem is not only unsolved but
unposed!
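
The choice described above can be put in numbers. The following is a rough sketch, assuming hypothetical yearly differences between two treatments and a common within-year standard error (`within_se`, a name invented for this example); it is not taken from the text, and real analyses of variance would obtain both error terms from the appropriate mean squares.

```python
import math
from statistics import mean, stdev

def short_and_long_se(yearly_diffs, within_se):
    """Standard errors for the average A-minus-B difference.

    yearly_diffs: observed difference in each year.
    within_se: standard error of each yearly difference from within-year
    error alone (assumed equal across years for simplicity).
    """
    k = len(yearly_diffs)
    avg = mean(yearly_diffs)
    # Short inference: these particular years; only within-year error counts.
    se_short = within_se / math.sqrt(k)
    # Long inference: the years are a sample of years, so the year-to-year
    # variation of the differences is the yardstick.
    se_long = stdev(yearly_diffs) / math.sqrt(k)
    return avg, se_short, se_long

# Hypothetical differences over five years.
avg, se_short, se_long = short_and_long_se([2.0, 5.5, 1.0, 6.5, 3.0], 0.8)
```

The point estimate is the same in both cases; only the yardstick of error, and hence the width of the confidence limits, changes with the length of the inference.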

Some have queried the use of “short” and “long” in this context, and




have tried to relate this choice to that of the proper “breadth” of 
foundation (the advantages of sufficiently broad basis of inference 
have, of course, been ably discussed by Fisher [15, Section 39]). It is 
important to avoid possible confusion in this regard. Considerations of 
breadth arise during the design of an experiment, while considerations 
of length arise in its interpretation. Thus an experiment to compare 
certain psychological characteristics within brother-sister pairs would 
be broadened as to foundation if changed from 50 pairs drawn from 
Indiana to 5 subgroups of ten pairs each from 5 geographically and 
culturally separated areas. For either experiment, there will be a
problem of length of inference: Will we make statements about the average
over the 50 pairs of perfectly measured differences, or shall we make
statements concerning the average differences in larger populations of
which these 50 pairs, or these 5 sets of 10 pairs, are a sample or samples?
The two questions are quite separate. 

(C3) Techniques of evaluating both the isolated experiment and history
down to date will continue to be useful. There are many experimental
procedures that involve either the regular measurements of control 
specimens or the regular use of special calibration procedures. After a
new calibration, should we use the old calibration? Should we use only 
the new calibration? Or should we combine old and new values? With 
what relative weights? This is a recurrent problem, one whose solution 
might improve measurement accuracies per dollar in a wide variety of
applications. But who has the solution? Or better, “the solutions,”
because the path is long from the isolated group of occasional
measurements to the production line producing measurements steadily.
Different locations along this path will require different solutions. Work on
this problem has undoubtedly been hampered by the tradition of the 
self-contained experiment. But many measurement procedures are far 
from self-contained experiments.

Like unto this first example is a second. Most procedures of statistical 
analysis today include a measure of spread in this particular
experiment, be it an estimated variance, a total or mean range, or the mean
square in a certain line of the analysis of variance. Usually there is past
evidence as to the variability in question. In assessing the results of a
particular experiment shall we use only the estimate from within the
experiment? Only past history? Some combination of the two? Which
combination?

This problem of how far to look back is widespread and unsolved.
A solution might allow us to narrow the wide confidence limits that go
with wide apparent variation and to widen the falsely narrow ones




which go with narrow apparent variation. This would equalize our 
exposure to error, and tend to let us make sharper statements on the 
average. Again the philosophy of “each experiment to itself” has stood 
in the way. But why should we allow this to go on? (Of course the
philosophy of “each experiment to itself” is important, of course it must be
widely used, but neither always nor everywhere! Just another example
of (A2) and (C1).)
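
One familiar candidate for “some combination of the two” is to pool the current and historical variance estimates in proportion to their degrees of freedom. The sketch below is an assumption-laden illustration, not a solution to the problem posed: in particular `history_weight` is a hypothetical discount factor introduced here to express how far one is willing to look back.

```python
def pooled_variance(s2_now, df_now, s2_past, df_past, history_weight=1.0):
    """Combine a current and a historical variance estimate.

    history_weight in [0, 1] is a hypothetical discount on past evidence:
    0 reproduces "each experiment to itself", while 1 treats the history
    as fully comparable with the present experiment.
    """
    if not 0.0 <= history_weight <= 1.0:
        raise ValueError("history_weight must lie between 0 and 1")
    eff_df_past = history_weight * df_past   # discounted historical df
    df_total = df_now + eff_df_past
    s2 = (df_now * s2_now + eff_df_past * s2_past) / df_total
    return s2, df_total

# Hypothetical figures: a noisy-looking experiment against a long record.
alone = pooled_variance(4.0, 5, 1.0, 45, history_weight=0.0)
full = pooled_variance(4.0, 5, 1.0, 45, history_weight=1.0)
```

Choosing the weight is exactly the open question; the code only makes the two extremes, and the continuum between them, explicit.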

(C4) “What should be done” is almost always more important than 
“what can be done exactly.” Hence new developments in experimental statis- 
tics are more likely to come in the form of approximate methods than in the 
form of exact ones. Once upon a time the calculation of the first four 
moments was an honorable art in statistics. Then came those who could 
calculate the exact distributions of simple expressions. And because 
their results were “exact” they took over the place of honor. (Partly 
too, perhaps, because the moment calculators failed on occasion to 
transform their expressions wisely before calculating the moments.) 
And it came to be infra dig to find moments. In seminars one heard 
A's achievement of calculating the first four moments for n’s up to 12 
belittled in comparison with B’s proof that the distribution tended to 
normality as n tended to infinity. Yet which result was more useful to 
the experimental statistician with experimental data for n equal to 5, 
10, 20 or even 50? Probably the first four moments.

If the moments had been on MacArthur's staff, their parting
statement would have read “we shall return!” But when? I think that it is
high time to bring the calculation of moments back to that high estate
which it deserves. We shall always have to deal with messy expressions,
whose exact distribution will be found by no one, at least for a long
time. Moments may allow us to get on with the work. If they do allow
us to do this, let us use them.
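
For small samples from a finite set of equally likely values, the first four moments of even a messy statistic can be had by brute enumeration. The following sketch is only an illustration of that idea; the choice of statistic (the sample range) and of population (the faces of a die) are assumptions made for the example.

```python
import itertools

def first_four_moments(statistic, support, n):
    """Exact mean, variance, skewness and kurtosis of statistic(sample),
    where the sample is n independent, equally likely draws from support."""
    values = [statistic(s) for s in itertools.product(support, repeat=n)]
    m = len(values)
    mu = sum(values) / m
    # Second, third and fourth central moments over all possible samples.
    m2, m3, m4 = (sum((v - mu) ** k for v in values) / m for k in (2, 3, 4))
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return mu, m2, skewness, kurtosis

def sample_range(sample):
    """Largest minus smallest observation."""
    return max(sample) - min(sample)

# Range of three draws from the values 1..6 (216 equally likely samples).
mu, var, skew, kurt = first_four_moments(sample_range, range(1, 7), 3)
```

Enumeration grows quickly with n, which is why it suits the n of 5 or 10 met in practice better than any limit as n tends to infinity.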

The variability of estimates of spectra of time series provides a case
in point. Even with the normality assumption, the exact distribution is 
not going to be easily manageable. Yet the first two moments can be 
found, and found with very useful results. Considerable recent progress 
in the analysis of physical time series rests on those two moments [e.g.

, 29]. 

(D1) Statisticians must face up to the existence and varying importance 
of systematic errors. The failure of the statistician to take sufficient
cognizance of systematic errors has been in part an escape phenomenon.
To a man looking hopefully for a way to shorten a confidence interval
by 7 per cent of its length by ingenious devices, the thought of
systematic errors which might make it twice as long comes as a severe shock,




and all men try to avoid shocks. Perhaps, too, the recent development 
of statistics in connection with the uncomfortable sciences like agri- 
culture and biology—uncomfortable because unsystematic errors tend 
to be so large—may have much to do with this. Only the sampling
survey statisticians, with their recent treatment of “non-sampling errors”
seem to be facing up to the existence of systematic errors. 

What should experimental statistics as a whole do about systematic 
errors? Should we change from “95 per cent confidence” to “5 per cent 
diffidence” and impress on our clients that more diffidence has to be 
added because of systematic errors? Have we been overselling our 
clients on the confidence with which they should accept the results of 
our analyses? Is this why physics is the most resistant of all the sciences
to the penetration of statistics? 

Some there will be who will claim that the old ways are good enough, 
since in comparative experiments the systematic errors tend to be very 
much smaller than in absolute experiments. Very much smaller, but 
not zero, is the answer. (The experimental statistician dare not shrink 
from the war cry of the analyst, “Only a fool would use it, but it's
better than we used to use!” but on the other hand, he dare not take
the motto as a permanent excuse for sloppy methods.) Here is a real
unsolved problem of experimental statistics: What about systematic
errors? 

(D2) Statisticians have an obligation to clarify the foundations of their 
techniques for their clients. I have the impression that, at the time the 
analysis of variance was introduced, the practice of adjusting yields for 
the apparent fertility of blocks was, or would have been, regarded with 
suspicion—“cooking the observations.” Yet the analysis of variance,
which is quite equivalent in its results, seems to have spread without
opposition of this sort. Was this because the arithmetic was so complicated
that the poor client didn’t understand what was going on? I am
sorely afraid that this was the case. 

At the beginning, it may have paid the statisticians to fool their clients
about the analysis of variance, but does it today? I give vent to a
hearty “no!”, feeling that many clients get far less out of such analyses
than they should, because they don’t understand what is going on.
How many of your clients really understand what sorts of additive
decompositions of the observations underlie the analyses of variance
you proudly return to them?

How to explain to the client what the analysis of variance is about?
This is surely a problem of experimental statistics. Even if I should
know a large part of the answer, as I hope I do, it is an unsolved
problem, since the answer is not at the finger tips of enough experimental
statisticians.

In how many other areas are we losing by fooling our clients? 

(D3) Statisticians should be honest and expository about the relation of
precise “assumptions” and exactly “optimum” solutions to real situations. 
As an example here, let us take a field currently under development. 
Box and his coworkers have been, and continue to be, active in the 
development of designs for the estimation of all the zeroth, first, and 
second degree coefficients in a second degree response surface, where 
the response is a function of 1, 2, 3, 4, 5, etc., variables. In the process 
he is resting heavily on such “exact” concepts as “orthogonality” and
“estimating all coefficients with the same variance.” He is well aware 
that, because of the way the designs are to be used, these “exact” math- 
ematical properties are not likely to correspond to any physical reali- 
ties, that, in any particular situation, there is no reason to believe that 
the “exactly optimum” design is appreciably better than any nearby 
design. But even if “exactly optimum” does not mean what it says, it
may well mean “likely to be quite useful,” as in this case it does.
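
To give a sense of scale for such designs, the number of zeroth, first, and second degree coefficients in a full second degree surface can be counted directly. This is a small sketch; the function name is hypothetical.

```python
from math import comb


def second_degree_coefficients(k):
    """Count the coefficients of a full second degree polynomial in k variables."""
    zeroth = 1                 # the constant term
    first = k                  # one linear coefficient per variable
    pure_quadratic = k         # one squared term per variable
    cross_product = comb(k, 2) # one product term per pair of variables
    return zeroth + first + pure_quadratic + cross_product


counts = {k: second_degree_coefficients(k) for k in range(1, 6)}
print(counts)
```

The count is (k + 1)(k + 2)/2, so it grows quadratically with the number of variables.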

How many of the potential users of such designs will understand that 
“exactly optimum” doesn’t mean what it says? All too few, and for the 
others we statisticians are likely to be to blame. We have pushed
“optimum” procedures for one reason or another, without adequate
warning about idealizations and the real world. As a psychologist once
said when Mosteller discussed "inefficient statistics" before the Eastern
Psychological Association, “inefficient statistics, but efficient statisti-
cians”! How often do we miss the chance to have “non-optimal tech-
niques, but optimal statisticians” apply to us?

Another example of the same sort looms large on the horizon. It 
concerns all of bioassay and much of the transformation of counted
data (a subject about which there are whispers of new discussion). Little
attention has been paid to gains or losses from “exact” maximum likeli-
hood, minimum chi-square, or unbiased solutions of bioassay problems.
Much attention has been spent in getting these “exact” solutions. Does
it matter whether we use logits, probits, or anglits? How much does it
matter? (On this there is some information.) What happens if a little
non-binomial fluctuation creeps in? Have we been realistic about any-
thing in this whole area? Clearly there are many unsolved problems of
experimental statistics here.
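
A rough feel for how much the choice among logits, probits, and anglits matters can be had by putting the three on a common scale. The sketch below makes two assumptions of its own: the anglit is taken as arcsin(sqrt(p)) in radians, and each transform is centered and rescaled to have zero value and unit slope at p = 0.5.

```python
import math
from statistics import NormalDist


def logit(p):
    return math.log(p / (1.0 - p))


def probit(p):
    return NormalDist().inv_cdf(p)


def anglit(p):
    # Angular transformation, taken here as arcsin(sqrt(p)) in radians
    # (an assumed convention; other scalings are in use).
    return math.asin(math.sqrt(p))


# Value and slope of each transform at p = 0.5, used to put all three on one scale.
CENTER_AND_SLOPE = {
    "logit": (0.0, 4.0),
    "probit": (0.0, math.sqrt(2.0 * math.pi)),
    "anglit": (math.pi / 4.0, 1.0),
}
TRANSFORMS = {"logit": logit, "probit": probit, "anglit": anglit}


def standardized(name, p):
    """Transform p, then center and rescale so that all three agree near p = 0.5."""
    center, slope = CENTER_AND_SLOPE[name]
    return (TRANSFORMS[name](p) - center) / slope


for p in (0.5, 0.6, 0.8, 0.95, 0.99):
    print(p, {name: round(standardized(name, p), 3) for name in TRANSFORMS})
```

On this scale the three are nearly indistinguishable in the middle of the range and separate only toward the extremes, so any practical difference depends on how much of the data lies in the tails.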

(D4) In every statistical area, we almost certainly need methods ad-
mitting one more nuisance parameter, methods of one higher level of robust-
ness and de-parametrization, methods with both of these desiderata. Here
we may turn the carpet back to see the dirt; it is a large carpet trying
to cover much dirt. We have a reasonably wide variety of procedures
for analyzing counted data which assume pure binomial variation:
contingency tables, chi-square, and ω² goodness of fit tests, Kolmo-
goroff-Smirnoff bounds on the population distribution, all-or-none bio-
assay, and so on. The list is long. Many of the techniques are important.
All of them need procedures admitting the possibility of additional non-
binomial variation. We gave up long ago assuming that we knew the
variance of yield of soy bean plots of given size, even though we had
empirical data on it. We blithely assume that we know the variance of
preparing a dilution and the variance of death among guinea pigs in-
jected with a single dilution: we assume one to be zero and the other
to be binomial! We would criticize the varietal trial without an internal
estimate of error, yet we look silently on the bioassay without one.
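
An internal estimate of error for counted data is not hard to sketch when replicate groups are available at a single dilution. The following compares the observed variation among replicate counts with what pure binomial variation would give; the function name and the counts are hypothetical, and equal group sizes are assumed.

```python
def binomial_dispersion(deaths, group_size):
    """Ratio of the observed variation among replicate counts to that
    expected under pure binomial variation (Pearson chi-square / d.f.)."""
    k = len(deaths)
    p_hat = sum(deaths) / (k * group_size)
    expected_count = group_size * p_hat
    binomial_variance = group_size * p_hat * (1.0 - p_hat)
    chi_square = sum((d - expected_count) ** 2 for d in deaths) / binomial_variance
    return chi_square / (k - 1)


# Hypothetical deaths in six replicate groups of 10 animals at one dilution.
ratio = binomial_dispersion([2, 8, 5, 9, 1, 5], group_size=10)
print(ratio)
```

A ratio near 1 is what pure binomial variation would produce on the average; a ratio well above 1 suggests additional variation, for example from preparing the dilution.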

Perhaps in part we have not attacked these problems because of their 
resemblance to those cited under (C3). Perhaps we have not attacked 
them because their consideration would disturb our clients’ techniques
or bring to light new sources of variation. But whatever the reasons,
they do not seem valid to me today.

Here are many unsolved problems in experimental statistics. 

(D5) Statistics must continually study the behavior of its techniques
when their conventional assumptions are not true. I have touched on some
minor examples of this principle. Let me cite a few major ones.

Many statistical techniques assume homogeneity of variance. Each of
them needs a related technique assuming inhomogeneity of variance.
How do the present techniques stand up under inhomogeneity?

Many statistical techniques utilize a normality assumption almost
exclusively as a means for predicting the stability of estimated vari-
ances. Each needs a related robustified technique which allows for the
effects of non-normality on this stability. How do the present tech-
niques stand up under non-normality?
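
The dependence of this stability on normality can be made concrete. For independent observations with finite fourth moment, the variance of the usual unbiased variance estimate is sigma^4 (2/(n - 1) + g2/n), where g2 is the excess kurtosis. The sketch below evaluates this for three underlying distributions; the function name and the sample size are illustrative assumptions.

```python
def variance_of_sample_variance(n, sigma2=1.0, excess_kurtosis=0.0):
    """Variance of the unbiased sample variance for n independent observations
    with variance sigma2 and the given excess kurtosis."""
    return sigma2 ** 2 * (2.0 / (n - 1) + excess_kurtosis / n)


n = 10
# Excess kurtosis: normal 0, uniform -1.2, double exponential (Laplace) 3.
cases = {"normal": 0.0, "uniform": -1.2, "double exponential": 3.0}
normal_value = variance_of_sample_variance(n)
relative = {name: variance_of_sample_variance(n, excess_kurtosis=g2) / normal_value
            for name, g2 in cases.items()}
print(relative)
```

With n = 10 the double exponential case gives an estimated variance more than twice as unstable as normal theory predicts, which is the kind of effect a robustified technique would have to allow for.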

Many discussions of efficiency of estimation assume an underlying
normal distribution. Each needs related studies assuming suitably
varied nonnormal distributions.

How many unsolved problems do we need? 




SOME PROVOCATIVE QUESTIONS

In providing examples of the various general principles, I have in-
dicated a number of unsolved problems of experimental statistics, but
there are a few more at the tip of the tongue. In this section I shall try
to provide a few more, mostly indirectly, by trying to ask some provoca-
tive questions.

(1) What are we trying to do with goodness of fit tests? (Surely not to 
test whether the model fits exactly, since we know that no model fits
exactly!) What then? Does it make sense to lump the effects of syste-
matic deviations and over-binomial variation? How should we express
the answers of such a test? 

(2) Why isn’t someone writing a book on one- and two-sample tech- 
niques? (After all, there is a book being written on the straight line!) 
Why does everyone write another general book? (Even 800 pages is 
now insufficient for a complete coverage of standard techniques.) How 
many other areas need independent monograph or book treatment? 

(3) Does anyone know when the correlation coefficient is useful, as
opposed to when it is used? If so, why not tell us? What substitutes are 
better for which purposes? 

(4) Why do we test normality? What do we learn? What should we 
learn? 

(5) How soon are we going to develop a well-informed and consistent 
body of opinion on the multiple comparison problem? Can we start soon 
with the immediate end of adding to knowledge? And even agree on 
the place of short cuts? 

(6) How soon are we going to separate regression situations from com- 
parison situations in the analysis of variance? When will we clearly 
distinguish between temperatures and brands, for example, as classifi-
cations?

(7) What about regression problems? Do we help our clients to use
regression techniques blindly or wisely? What are the natural areas in
regression? What techniques are appropriate in each? How many have
considered the “analyses of variance” corresponding to taking out the
regression coefficients in all possible orders?
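
What "taking out the regression coefficients in all possible orders" amounts to can be sketched with sequential sums of squares. The code below is an illustration only: it uses Gram-Schmidt orthogonalization on a small hypothetical data set with two correlated predictors, and all names are assumptions.

```python
from itertools import permutations


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def residual_after(v, basis):
    """Remove from v its projection on each of the mutually orthogonal basis vectors."""
    out = list(v)
    for q in basis:
        coef = dot(out, q) / dot(q, q)
        out = [o - coef * b for o, b in zip(out, q)]
    return out


def sequential_ss(y, predictors, order):
    """Sum of squares removed by each predictor when fitted in the given order."""
    basis = [[1.0] * len(y)]  # the mean is always taken out first
    ss = {}
    for name in order:
        q = residual_after(predictors[name], basis)
        ss[name] = dot(y, q) ** 2 / dot(q, q)
        basis.append(q)
    return ss


# Hypothetical data with two correlated predictors.
predictors = {"x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
              "x2": [1.0, 1.0, 2.0, 2.0, 3.0, 4.0]}
y = [2.0, 3.0, 5.0, 6.0, 8.0, 11.0]

tables = {order: sequential_ss(y, predictors, order)
          for order in permutations(predictors)}
for order, ss in tables.items():
    print(order, ss)
```

The total removed by both predictors is the same in either order, but the share credited to each predictor changes with the order, and it is that change which the several analyses of variance display.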

(8) What about significance vs. confidence? How many experimental 
statisticians are feeding their clients significance procedures when 
available confidence procedures would be more useful? How many are 
doing the reverse?

(9) Who has clarified, or can clarify, the problem of nonorthogonal 
(disproportionate) analysis of variance? What should we be trying to do 
in such a situation? What do the available techniques do? Have we 
allowed the superstition that the individual sums of squares should 
add up to the total sum of squares to mislead us? Do we need to find
new techniques, or to use old ones better?

(10) What of the analysis of covariance? (There are a few, at least
one [10], discussions which have been thought about.) How many
experimental statisticians know more than one technique of interpre-
tation? How many of these know when to use each? What are all the
reasonable immediate aims of using a covariable or covariables? What
techniques correspond to each?

(11) What of the analysis of variance for vectors? Should we use overt 
multivariate procedures, or the simpler ones, ones that more closely 
resemble single variable techniques, which depend on the largest de- 
terminantal root? Who has a clear idea of the strength or scope of such 
methods? 

(12) What of the counting problems of nuclear physics? (For some of 
these the physicists have sound asymptotic theory, for others repairs 
are needed; cf. Link [24].) What happens less asymptotically? What
about the use of transformations? What sort of nuisance parameter is 
appropriate to allow for non-Poisson fluctuations? What about the 
more complex problems? 

(13) What about the use of transformations? Have the pros and cons
been assembled? Will the swing from significance to confidence increase 
the use of transformations? How accurate does a transformation need 
to be? Accurate in doing what? 

(14) Who has consolidated our knowledge about truncated and censored
(cf. [18], p. 149) normal distributions so that it is available? Why not a
monograph here that really tells the story? Presumably the techniques
and insight here are relatively useful, but how and for what?

(15) What about range-based methods for more complex situations? (We
have methods for the analysis of single and double classifications based
on ranges.) What about methods for more complex designs like bal-
anced incomplete blocks, higher and fractional factorials, lattices, etc.?
In which areas would they be quicker and easier? In which areas would
they lead to deeper insight?

(16) Do the recent active discussions about bioassay indicate the solution
or impending solution of any problems? What about logits vs. probits?
Minimum chi-square vs. maximum likelihood? Less sophisticated
methods vs. all these? Which methods are safe in the hands of an ex-
pert? Which in the hands of a novice? Does a prescribed routine with
a precise “correct answer” have any value as such?

(17) What about life testing? What models should be considered be-
tween the exponential distribution and the arbitrary distribution?
What about accelerated testing? (Clearly we must use it for long-
lived items.) To what extent must we rely on actual service use to
teach us about life performance?



(18) How widely should we use angular randomization [4]? What are 
its psychological handicaps and advantages? Dare we use it in explora- 
tory experimentation? What will be its repercussions on the selection 
of spacings? 

(19) How should we seek specified sorts of inhomogeneity of variance
about a regression? What about simple procedures? Can we merely
regress the squared deviations from the fitted line on a suitable func-
tion? (Let us not depend on normality of distribution in any case!)
What other approaches are helpful? 
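
The simple procedure asked about in (19) can be sketched in a few lines: fit the line, square the deviations, and regress them on a function of the regressor. Taking that "suitable function" to be the regressor itself is an assumption made here for illustration, as are the names and the data.

```python
def fit_line(x, y):
    """Least squares intercept and slope of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def variance_trend(x, y, g=lambda t: t):
    """Regress squared deviations from the fitted line on g(x)."""
    intercept, slope = fit_line(x, y)
    squared_dev = [(yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y)]
    return fit_line([g(xi) for xi in x], squared_dev)


# Hypothetical data whose scatter about the line grows with x.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.0 * xi + (0.5 * xi if i % 2 == 0 else -0.5 * xi) for i, xi in enumerate(x)]

trend_intercept, trend_slope = variance_trend(x, y)
print(trend_intercept, trend_slope)
```

A clearly positive slope points to variance increasing with x. No normality is used in computing it, though judging its significance is a separate question that this sketch does not address.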

(20) How soon can we begin to integrate selection theory? How does the 
classical theory for an infinite population (as reviewed by Cochran 
[8]) fit together with the second immediate aim of multiple comparisons 
(Bechhofer et al. [1, 2, 14]) and with the a priori views of Berkson [3]
and Brown [6]? What are the essential parameters for the characteriza- 
tion of a specific selection problem? 

(21) What are appropriate logical formulations for item analysis (as 
used in the construction of psychological tests)? (Surely simple signifi- 
cance tests are inappropriate!) Should we use the method introduced 
by Eddington [32, pp. 101-4] to estimate the true distribution of se- 
lectivity? Should we then calculate the optimum cut off point for this 
estimated true distribution? Or what? 

(22) What should we do when the items are large and correlated? (If,
for example, we start with 150 measures of personality, and seek to 
find the few most thoroughly related to a given response or attitude.) 
What kind of sequential procedure? How much can we rely on routine 
item analysis techniques? How does experiment for insight differ from 
experiment for prediction?

(23) How many experimental statisticians are aware of the problems of
astronomy? What is there in Trumpler and Weaver's book [32] that is
new to most experimental statisticians? What in other observational
problems like the distribution of nebulae (e.g. [23, 26])?

(24) How many experimental statisticians are aware of the problems 
of geology? What is there in the papers on statistics in geology in the 
Journal of Geology for November 1953 and January 1954 that is new 
to most experimental statisticians? What untreated problems are sug- 
gested there? 

(25) How many experimental statisticians are aware of the problems
of meteorology? What is there in the books of Conrad and Pollak [9] and
of Carruthers and Brooks [7] that is new to most experimental statis-
ticians? What untreated problems are suggested there?

(26) How many experimental statisticians are aware of the problems
of particle size distributions? What is there in Herdan’s book [21] on
small particle statistics that is new to most experimental statisticians?
What untreated problems are suggested there?

(27) What is the real situation concerning the efficiency of designs with
self-adjustable analyses (lattices, self-weighted means, etc.) as com-
pared with their apparent efficiency? Meier [25] has attacked this prob-
lem for some of the standard cases, but what are the repercussions? What
will happen in other cases? Is there any generally applicable rule of 
thumb which will make approximate allowance for the biases of un- 
sophisticated procedures? 

(28) How can we bring the common principles of design of experiments 
into psychometric work? How can we make allowance for order, prac- 
tice, transfer of training, and the like through specific designs? Are 
environmental variations large enough so that factorial studies should 
always be done simultaneously in a number of geographically separated 
locations? Don't we really want to factor variance components? If so, 
why not design psychometric experiments to measure variance com- 
ponents? 

(29) How soon will we appreciate that the columns (or rows) of a con-
tingency table usually have an order? When there is an order, shouldn't
we take this into account in our analyses? How can they be efficient
otherwise? Should we test only against ordered alternatives? If not,
what is a good rule of thumb for allocating error rates? Yates [40]
has proposed one technique. What of some others and a comparison
of their effectivenesses?

We come now to a set of questions which belong in the list, but which 
we shall treat only briefly since substantial work is known to be in 
progress:

(30) What usefully can be done with mxn contingency tables? 

(31) What of a very general treatment of variance components? 

(32) What should we really do with complex analyses of variance? 

(33) How can we modify means and variances to provide good effi-
ciency for underlying distributions which may or may not be normal?

(34) What about statistical techniques for data about queues, tele-
phone traffic, and other similar stochastic processes?

(35) What are the possibilities of very simple methods of spectral
analysis of time series?

(36) What are the variances of cospectral and quadrature spectral
estimates in the Gaussian case?

(37) What are useful general representations for higher moments of
stationary time series?



Next we revert to open questions: 

(38) How should we measure and analyze data where several coordi-
nates replace the time? What determines the efficiency of a design?
Should we use numerical filtering followed by conventional analysis?
How much can we do inside the crater?

(39) What of an iterative approach to discrimination? Can Penrose's
technique [28] be usefully applied in a multistage or iterative way or
both? Does selecting two composites from each of several subgroups
and then selecting supercomposites from all these composites pay? If
we remove regression on the first two composites from all variables, 
can we usefully select two new composites from among the residuals? 

(40) Can the Penrose idea be applied usefully to other multiple regres- 
sion situations? Can we use either the simple Penrose or the special 
methods suggested above? 

(41) Is there any sense in seeking a method of “internal discriminant 
analysis”? Such a method would resemble factor analysis in resting on 
no external criterion, but might use discriminant-function-like tech- 
niques. 

(42) Why is there not a clearer discussion of higher fractionation? 
Fractionation (by which we include both fractional factorials and con-
founding) is reasonably well expounded for the 2^n case. But who can
make 3^n, 4^n, 5^n, etc. relatively intelligible?

(43) How many useful fractional factorial designs escape the current
group theoretical techniques? After all, Latin Squares are 1/k-ths of a k^3,
and most transformation sets do not correspond to simple group
theory.
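
The remark that a Latin square is a fraction of a k^3 factorial can be checked directly: a k by k square uses k^2 of the k^3 possible combinations of row, column, and treatment. The sketch below uses the cyclic square, which is only one of many Latin squares; the function names are hypothetical.

```python
from itertools import product


def cyclic_latin_square_fraction(k):
    """Return the k*k runs (row, column, treatment) of a cyclic Latin square,
    viewed as a subset of the k**3 factorial combinations."""
    return [(r, c, (r + c) % k) for r, c in product(range(k), repeat=2)]


def is_balanced(runs, k):
    """True if every pair of factors shows each of its k*k level
    combinations exactly once among the k*k runs."""
    for a, b in ((0, 1), (0, 2), (1, 2)):
        if len({(run[a], run[b]) for run in runs}) != k * k:
            return False
    return True


k = 4
runs = cyclic_latin_square_fraction(k)
print(len(runs), k ** 3, is_balanced(runs, k))
```

The cyclic square does come from a simple group rule; the point of the question is that most Latin squares are not of this kind and yet are balanced in exactly the same sense.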

(44) In many applications of higher fractionals, the factors are scaled— 
why don’t we know more about the confounding of the various orthogonal
polynomials and their interactions (products)? Even a little inquiry shows 
that some particular fractions are much better than others of the
same type. 

(45) What about redundant fractions of mixed factorials? We know
perfectly well that there is no useful simple (nonredundant) fraction
of a 2^3 3^4, but there may be a redundant one, where we omit some
observations in estimating each effect. What would it be like?

A number of further provocative questions have been suggested by 
others as a result of the distribution of advance copies of this paper 
and its oral presentation. I indicate some of them in my own words
and attitude: 

(46) To what extent should we emphasize the practical power of a test?
Here the practical power is defined as the product of the probability
of reaching a definite decision given that a certain technique is used
by the probability of using the technique. (C. Eisenhart)

(47) What of regression with error in x? Are the existing techniques 
satisfactory in the linear case? What of the nonlinear case? (K. A.
Brownlee)

(48) What of regression when the errors suffer from unknown auto- 
correlations? What techniques can be used? How often is it wise to use 
them? (K. A. Brownlee) 

(49) How can we make it easier for the statistician to “psychoanalyze”
his client? What are his needs? How can the statistician uncover them? 
What sort of a book or seminar would help him? (W. H. Kruskal) 

(50) How can statisticians be successful without fooling their clients to
some degree? Isn’t their professional-to-client relation like that of a 
medical man? Must they not follow some of the principles? Do statis- 
ticians need a paraphrase of the Hippocratic Oath? (W. H. Kruskal) 

(51) How far dare a consultant go when invited? Once a consultant is
trusted in statistical analysis and design, then his opinion is asked on 
a wider and wider variety of questions. Should he express his opinion 
on the general direction that a project should follow? Where should he 
draw the line? (R. L. Anderson) 

In closing these questions, it should not be necessary to remind the 
reader that neither in the last section of examples nor in this section of
provocative questions have we tried to suggest an order of importance
for the unsolved questions suggested. We leave that to the reader.


TOOL BUILDING VS. PROBLEM SOLVING 


To judge from published books and articles, experimental statistics 
has grown by finding tools somehow, and then running around using 
them. (This impression is undoubtedly somewhat inaccurate.) Why has 
experimental statistics not been more obviously concerned with prob-
lems? Partly, perhaps, because it is just beginning to get its growth.
Partly, perhaps, because dealing with problems is difficult and likely 
to lead to approximate solutions. These are valid reasons, but not 
valid excuses. 

As experimental statistics grows toward maturity, it surely should 
orient more toward areas rather than toward techniques. How much 
more may be a question. But an essential prerequisite to such reorienta- 
tion is some picture of what are the areas. This picture will not spring 
forth full armed, but will come from much work and discussion. As an
attempted “trigger” for this work and discussion, the next section pre-
sents a feeble first attempt at classification. Reader, can you do better?



A FEEBLE GUIDE TO AREAS 


We shall set up with a digital classification, but without prejudice 
as to whether the classification provided by one digit is crossed with 
or nested inside that provided by another. The digits provided will 
usually not specify an area completely, but they will usually narrow
the situation down to a small number of areas. 

The first digit classification refers to the general end of the analysis 
as follows: 

(The assessment of, or determination of a wise action in view of) 


(1) Typical response

(2) Variability of response

(3) Distribution of response

(4) Concealed structures and their coefficients

(5) Control charts and other “spotting” procedures

(9) Miscellaneous


(If answers are expressible in simple or mixed cumulants, then the
degree of these cumulants with respect to response variables is control-
ling. (1) contains cases of degree 1; (2) contains cases of degree 2; (3)
contains cases of higher degree.) Under (1) are included regression
coefficients as well as means, while correlation analysis considered as a
study of predictability comes under (2). Contingency tables fall under
(1), except when the issue is homogeneity, when they fall under (2).
Factor analysis seems better placed under (4) than under (2), but
structural regression, as practiced in econometrics, seems to fall most
naturally under (1).

The second digit classification refers to the situation of measurement,
and, in description at least, has to be subordinated to the first digit.
It runs


(-1) Isolated (one or a few) responses, isolated (one or a few) vari-
abilities, isolated (one or a few) distributions, etc.

(-2) Response curves or surfaces, variabilities as functions of en-
vironmental variables, etc.

(-3) Inverse responses (what environment(s) produces a given re-
sponse), inverse variabilities, etc.

(-4) Response to nonenvironmental variable (e.g. time shape of
pulses, distribution of grain sizes, power spectrum of time
series.)

(-9) Miscellaneous


All of bioassay and sensitivity testing will of course be found in (-3).



Problems of maximization of response by altering quantitative vari- 
ables fall best into (-2), since attempts to put them into (-3) as the 
search for that environment where the derivatives vanish seem unwise. 

The third digit classification refers to the nature of the measurement, 
and is easy to apply, namely 


(--1) Absolute measurements without calibration problems

(--2) Intermediate cases

(--3) Absolute measurements by comparison with a standard

(--4) Comparative measurements among a family without calibra-
tion problems

(--5) Intermediate cases

(--6) Comparative measurements among a family with the aid of
standards

(--9) Miscellaneous


The conventional problems of bioassay fall in (--3), while sensitivity 
to explosion or breakage problems based on falling weights may fall in 
(--1). Conventional comparisons of varieties and fertilizers are usually 
thought to fall in (--4), but must, in many cases, fall in (--4). 

The fourth digit expresses the kind of response considered, and is 
again easy to apply. The classes are: 

(---, 1) Directly measured responses

(---, 2) Responses measured as slopes or regression coefficients

(---, 3) Adjusted responses (as by covariance)


No examples seem to be needed. 
The fifth digit specifies the nature of the response, as follows: 


(---, -1) Measured response (on reproducible scale)

(---, -2) Scored or rated response (by judge or panel)

(---, -3) Counted (all-or-none) response

(---, -9) Miscellaneous


At the present, the impact of this digit on statistical technique is very
noticeable. Should it remain so?


The sixth digit specifies the complexity of the response, as follows:


(---, --1) Single variate response 

(---, --2) Bivariate response 
(and so on) 

(---, --8) Many variate response 


(---, --9) Miscellaneous 
Examples here are not needed.


The seventh digit describes the complexity of the environments con-
sidered, as follows:


(---, ---, 1) Environment varied only randomly

(---, ---, 2) Environment varied in one measured way

(---, ---, 3) Environment varied randomly and in one measured way

(---, ---, 4) Environment varied in two measured ways

(---, ---, 5) Environment varied in a more complex manner

(---, ---, 9) Miscellaneous


REFERENCES 

[1] Bechhofer, Robert E., Dunnett, Charles W., and Sobel, Milton, “A two-
sample multiple decision procedure for ranking means of normal populations
with unknown variances,” Annals of Mathematical Statistics, 24 (1953), 136
(abstract).

[2] Bechhofer, Robert E., and Sobel, Milton, “A sequential multiple decision
procedure for ranking means of normal populations with known variances,”
Annals of Mathematical Statistics, 24 (1953), 136 (abstract).

[3] Berkson, Joseph, “‘Cost-Utility’ as a measure of the efficiency of a test,”
Journal of the American Statistical Association, 42 (1947), 246-55.

[4] Box, G. E. P., “Multifactor designs of first order,” Biometrika, 39 (1952),
49-57.

[5] Box, G. E. P., and Wilson, K. B., “On the experimental attainment of opti-
mum conditions,” Journal of the Royal Statistical Society, B13 (1951), 1-45.

[6] Brown, George W., “Basic principles for construction and application of dis-
criminators,” Journal of Clinical Psychology, 6 (1950), 58-60.

[7] Brooks, C. E. P., and Carruthers, N., Handbook of statistical methods in
meteorology. London, H. M. Stationery Office (1953).

[8] Cochran, W. G., “Improvement by means of selection,” Proceedings of 2nd
Berkeley Symposium on Mathematical Statistics and Probability (1951), 449-70.

[9] Conrad, Victor, and Pollak, L. W., Methods in Climatology, 2nd Edition,
Cambridge, Mass., Harvard University Press (1950), pp. 499.

[10] DeLury, D. B., “The analysis of covariance,” Biometrics, 4 (1948), 153-70.

[11] Duncan, David B., “A significance test for differences between ranked treat-
ments in an analysis of variance,” Virginia Journal of Science, 2 (1951), 172-
89 (abstract in Annals of Mathematical Statistics, 22 (1951), 142).

[12] Duncan, David B., “On the properties of the multiple comparisons test,”
Virginia Journal of Science, 3 (1952), 49-67.

[13] Duncan, David B., “Multiple range and multiple F tests.” Presented to
meeting of American Chemical Society, 1953, also Technical Report 6a,
Department of Statistics and Statistical Laboratory, Virginia Polytechnic
Institute.

[14] Dunnett, Charles W., and Sobel, Milton, “On a multivariate analogue of
Student's t-distribution, with some tables for the bivariate case.” Annals of
Mathematical Statistics, 24 (1953), 492 (abstract).



[15] Fisher, R. A., The Design of Experiments, First Edition, Edinburgh, Oliver
and Boyd (1936).

[16] Friedman, Milton, and Savage, L. J., “Planning experiments seeking max-
ima,” Chap. 13 of Statistical Research Group, Columbia University, Tech-
niques of Statistical Analysis (edited by Churchill Eisenhart, Millard W.
Hastay, and W. Allen Wallis), New York, McGraw-Hill (1947).

[17] Goulden, C. H., Methods of Statistical Analysis, 1st Edition, New York,
John Wiley and Sons (1939), especially Chapter XI, Section 6 (pp. 122 ff.);
2nd Edition 1952, especially Chapter 5, Section 13 (pp. 90 ff.).

[18] Hald, Anders, Statistical Theory with Engineering Applications, New York,
John Wiley and Sons (1952).

[19] Haldane, J. B. S., “A note on non-normal correlation,” Biometrika, 36 (1949),
467-68.

[20] Healy, M. J. R., “Decision between two alternatives—how many experi-
ments,” Paper at the third international biometric conference, Bellagio,
September, 1953.

[21] Herdan, G., Small particle statistics, Amsterdam-Houston-New York-Paris,
Elsevier (1953).

[22] Hitchman, Norman, “What is the mission of operations research?” Journal
of the Operations Research Society of America, 1 (1953), 241-42.

[23] Limber, D. Nelson, “The analysis of counts of the extragalactic nebulae in
terms of a fluctuating density field,” Submitted to Astrophysical Journal.

[24] Link, Richard F., “Some Statistical Techniques Useful for Estimating the
Mean Life of a Radioactive Source,” Doctoral thesis, Princeton University,
1953.

[25a] Meier, Paul R., “Weighted means and lattice designs,” Doctoral thesis,
Princeton University.

[25b] Meier, Paul R., “Variance of a weighted mean,” Biometrics, 9 (1953), 59-

[26] Neyman, J., Scott, E. L., and Shane, C. D., “On the spatial distribution of
the galaxies. A specific model,” Astrophysical Journal, 117 (1953), 92-138.

[27] Panofsky, H. A., and McCormick, R. A., “The vertical momentum flux at
Brookhaven at 109 meters,” Geophysical Research Papers, 19 (International
Symposium Atmos. Turbulence Boundary Layer) (1952), 219-30.

[28] Penrose, L. S., “Some notes on discrimination,” Annals of Eugenics, 13
(1946-7), 228-37.

[29] Pierson, Willard J., Jr., A unified mathematical theory for the analysis, propa-
gation and refraction of storm generated ocean surface waves. Department of
Meteorology, New York University, 1952.

[30] Robbins, Herbert, and Monro, Sutton, “A stochastic approximation meth-
od,” Annals of Mathematical Statistics, 22 (1951), 400-7.

[31] Somerville, Paul N., “Optimum sample size for choosing the largest of
k parameters,” Paper at the Institute of Mathematical Statistics meet-
ing, Kingston, Ontario, 3 September 1953.

[32] Trumpler, Robert J., and Weaver, Harold F., Statistical Astronomy, Berke-
ley, University of California Press (1953).

[33] Tukey, John W., “Purposes of fiducial inference,” Paper before the Institute
of Mathematical Statistics, Minneapolis, 6 September 1951.



[34] Tukey, John W., “Allowances for various types of error rates,” Paper before
Institute of Mathematical Statistics, Blacksburg, 19 March 1952.

[35] Tukey, John W., “Multiple Comparisons,” Paper before American Statisti-
cal Association and Biometric Society, Chicago, 28 December 1952.

[36] Tukey, John W., “Various methods from a unified point of view,” Paper be-
fore Institute of Mathematical Statistics, Chicago, 29 December 1952.

[37] Tukey, John W., The problem of multiple comparisons, In preparation.

[38] Walsh, J. E., “Some significance tests for the median which are valid under
very general conditions,” Annals of Mathematical Statistics, 20 (1949), 64-81.

[39] Wilcoxon, Frank, Some rapid approximate statistical procedures, Insecticide
and Fungicide Section, Stamford Research Laboratories, American Cyan-
amid Co., 1948.

[40] Yates, Frank, “The analysis of contingency tables with groupings based on
quantitative characters,” Biometrika, 35 (1948), 176-81.

[41] Yates, F., “Principles governing the amount of experimentation in develop-
mental work,” Nature, 170 (1952), 138-40.


MEASURES OF ASSOCIATION FOR 
CROSS CLASSIFICATIONS* 


LEO A. GOODMAN AND WILLIAM H. KRUSKAL


University of Chicago
CONTENTS

1. INTRODUCTION
2. FOUR PRELIMINARY CONSIDERATIONS
2.2. Order
2.4. Manner of Formation of the Classes
3. CONVENTIONS
4. TRADITIONAL MEASURES
5. MEASURES BASED ON OPTIMAL PREDICTION
5.1. Asymmetrical Optimal Prediction. A Particular Model of Activity
5.2. Symmetrical Optimal Prediction. Another Model of Activity
5.4. Weighting Columns or Rows
6. MEASURES BASED UPON OPTIMAL PREDICTION OF ORDER
6.2. A Proposed Measure
7. THE GENERATION OF MEASURES BY THE INTRODUCTION OF LOSS FUNCTIONS
8.1. Generalities
8.2. A Measure of Reliability ...
13. SAMPLING PROBLEMS
14. CONCLUDING REMARKS

* This paper is partly an outgrowth of work sponsored by the Army, Navy, and Air Force th[...]
Advisory Committee for Research Groups in Applied Mathematics and Statist[...]
[...] No. N6ori-02035. We are indebted for helpful comments and criticisms to: Otis D. [...]
(University of Chicago), Churchill Eisenhart (National Bureau of Standards), Maurice G. [...]




When populations are cross-classified with respect to two 
or more classifications or polytomies, questions often arise 
about the degree of association existing between the several 
polytomies. Most of the traditional measures or indices of as- 
sociation are based upon the standard chi-square statistic or 
on an assumption of underlying joint normality. In this paper 
a number of alternative measures are considered, almost all 
based upon a probabilistic model for activity to which the
cross-classification may typically lead. Only the case in which
the population is completely known is considered, so no question
of sampling or measurement error appears. We hope,
however, to publish before long some approximate distribu- 
tions for sample estimators of the measures we propose, and 
approximate tests of hypotheses. Our major theme is that the 
measures of association used by an empirical investigator 
should not be blindly chosen because of tradition and con- 
vention only, although these factors may properly be given 
some weight, but should be constructed in a manner having 
operational meaning within the context of the particular prob- 
lem. 


1. INTRODUCTION

Many studies, particularly in the social sciences, deal with populations
of individuals which are thought of as cross-classified by
two or more polytomies. For example, the adult individuals living in
New York City may be classified as to

Borough:                         5 classes
Newspaper most often read:       perhaps 6 classes
Television set in home or not:   2 classes
Level of formal education:       perhaps 5 classes
Age:                             perhaps 10 classes

For simplicity we deal largely with the case of two polytomies, although
many of our remarks may be extended to a greater number. The double
polytomy is the most common, no doubt because of the ease with which
it can be tabulated and displayed on the printed page. Most of our
remarks suppose the population completely known in regard to the
classifications, and indeed this seems to be the way to begin in the
construction of rational measures of association. After agreement has
been reached on the utility of a measure for a known population, then


(London School of Economics and Political Science), Frederick Mosteller (Harvard University),
I. Richard Savage (National Bureau of Standards), Alan Stuart (London School of Economics and
Political Science), Louis L. Thurstone (University of North Carolina), John W. Tukey (Princeton
University), W. Allen Wallis (University of Chicago), and E. J. Williams (Commonwealth Scientific
and Industrial Research Organization, Australia). Part of Mr. Goodman's work on this paper was
carried out at the Statistical Laboratory of the University of Cambridge under a Fulbright Award and a
Social Science Research Council Fellowship. The authors were led to work on the problems of this paper
as a result of conversations with Louis L. Thurstone and Bernard R. Berelson.


734 AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1954 


one should consider the sampling problems associated with estimation 
and tests about this population parameter. 

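A double polytomy may be represented by a table of the following
kind:¹

$$
\begin{array}{c|cccc|c}
A \backslash B & B_1 & B_2 & \cdots & B_\beta & \text{Total} \\
\hline
A_1 & p_{11} & p_{12} & \cdots & p_{1\beta} & p_{1\cdot} \\
A_2 & p_{21} & p_{22} & \cdots & p_{2\beta} & p_{2\cdot} \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
A_\alpha & p_{\alpha 1} & p_{\alpha 2} & \cdots & p_{\alpha\beta} & p_{\alpha\cdot} \\
\hline
\text{Total} & p_{\cdot 1} & p_{\cdot 2} & \cdots & p_{\cdot\beta} & 1
\end{array}
$$

where

Classification A divides the population into the $\alpha$ classes
$A_1, A_2, \cdots, A_\alpha$.

Classification B divides the population into the $\beta$ classes
$B_1, B_2, \cdots, B_\beta$.

The proportion of the population that is classified as both $A_a$ and
$B_b$ is $p_{ab}$.

The marginal proportions will be denoted by
$p_{a\cdot}$ = the proportion of the population classified as $A_a$.
$p_{\cdot b}$ = the proportion of the population classified as $B_b$.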

¹ Tables of this kind are frequently called contingency tables. We shall not use this term because of
its connotation of a specific sampling scheme when the population is not known and one infers on the
basis of a sample.

If the use to which a measure of association were to be put could be
precisely stated, there would be little difficulty in defining an appropriate
measure. For example, using the above cross-classification of the
New York City population, a television service company might wish to
place a single newspaper advertisement which would be read by as
many prospective customers as possible. Then the important information
from the table of newspaper-most-often-read vs. television-set-in-home-or-not
would be: which newspaper is most often read among
those with television sets? And a reasonable measure of association
would simply be the proportion of those with television sets who read
this newspaper.
It is rarely the case, however, that the purpose of an investigation
can be so specifically stated. More typically an investigation is
exploratory or has a multiplicity of goals. Sometimes a measure of
association is desired simply so that a large mass of data may be summarized
compactly.

The basic theme of this paper is that, even though a single precise
goal for an investigation cannot be specified, it is still possible and
desirable to choose a measure of association which has contextual
meaning, instead of using as a matter of course one of the traditional
measures. In order to choose a measure of association which has meaning
we propose the construction of probabilistic models of predictive
activity, the particular model to be chosen in the light of the particular
investigation at hand. The measure of association will then be a
probability, or perhaps some simple function of probabilities, within such a
model. Such is our general contention; most of the remainder of this
paper is concerned with its exemplification in particular instances.

We wish to emphasize that the specific measures of association
described here are not presented as factotum or universal measures.
Rather, they are suggested as reasonable for use in appropriate
circumstances only, and even in those circumstances other measures may and
should be considered and investigated.

A good deal of attention has been paid in the literature to the special
case of two dichotomies. We are more interested here in measures of
association suitable for use with any numbers of classes in the
polytomies or classifications.

2. FOUR PRELIMINARY CONSIDERATIONS

Four distinctions or cautionary remarks should be made early in any
discussion of measures of association.

2.1. Continua

We may or may not wish to think of a polytomy as arising from an
underlying continuum. For example, age may for convenience be
divided into ten classifications, but it clearly does arise from an
underlying continuum; however, newspaper-most-often-read would scarcely
be so construed. If a polytomy does arise from an underlying continuum,
one may or may not wish to assume that the population has some
specific kind of distribution with respect to it.

In those cases in which all the polytomies of a study arise jointly from
a multivariate normal distribution on an underlying continuum, one
would naturally turn to measures of association based on the
correlation coefficients. These in turn might well be estimated from a sample
by the tetrachoric correlation coefficient method or a generalization of
it. In some cases one polytomy may arise from a continuum and the
other not. An interesting discussion of this case for two dichotomies
was given in 1915 by Greenwood and Yule ([3], Section 3). We do
not discuss either of these cases in this paper, but restrict ourselves to
situations in which there are no relevant underlying continua.

The desirability of assuming an underlying joint continuum was one
of the issues of a heated debate forty years ago between Yule [15] on
the one hand and K. Pearson and Heron [9] on the other. Yule's
position was that very frequently it is misleading and artificial to
assume underlying continua; Pearson and Heron argued that almost
always such an assumption is both justified and fruitful.

2.2. Order

There may or may not be an underlying order between the
classifications of a polytomy. For example “level of formal education” admits
an obvious ordering; but borough of residence would not usually be
thought of in an ordered way. If there is an ordering, it may or may
not be relevant to the investigation. Sometimes an ordering may be
important but not its direction. If there is an underlying
one-dimensional continuum, it establishes an ordering.

When there is no natural or relevant ordering of the classes of a
polytomy, one may reasonably ask that a measure of association not
depend on the particular order in which the classes are tabulated.

2.3. Symmetry

It may or may not be that one looks at two polytomies symmetrically.
When we are sure a priori that a causal relationship (if it exists)
runs in one direction but not the other, then our viewpoint will be
asymmetric. This will also happen if one plans to use the results of
the experiment in one direction only. On the other hand, there is often no
reason to give one polytomy precedence over another.

2.4. Manner of Formation of the Classes


Decisions about the definitions of the classes of a polytomy, or
changes from a finer to a coarser classification (or vice-versa), can
affect all the measures of association of which we know. For example,
suppose we begin with the 4×4 table


which might greatly change a measure of association. Or we might
combine the three bottom rows and the three right-hand columns.
This gives


which presents quite a different intuitive degree of association. By
other poolings one can obtain other 2×2 tables.

Although this example is extreme, similar changes can be made in
the character of almost any cross-classification table. Related examples
are discussed by Yule [15].

At first this consideration might seem to vitiate any reasonable
discussion of measures of association. We feel, however, that it is in fact
desirable that a measure of association reflect the classes as defined for
the data. Thus one should not speak, for example, of association between
income level and level of formal education without specifying particular
class definitions. Of course, in many cases association, however
measured, would not be much affected by any reasonable redefinition of
the classes, and then the above finicky form of statement can be
simplified. That the definition of the classes can affect the degree of
association naturally means that careful attention should be given to the class
definitions in the light of the expected uses of the final conclusions.


3. CONVENTIONS 


It is conventional, and often convenient, to set up a measure of 
association so that either 


(i) It takes values between −1 and +1 inclusive, is −1 or +1 in
case of “complete association,” and is zero in the case of
independence.

(ii) It takes values between 0 and +1 inclusive, is +1 in the case
of “complete association,” and is zero in the case of
independence.


Convention (i) is appropriate when the association is thought of as
signed (e.g., association between income and dollars spent is positive,
between income and per cent of income spent is negative). Convention
(ii) is appropriate when no such sign considerations exist, as when
there is no natural order.

“Complete association,” as we shall see, is somewhat ambiguous.
“Independence,” on the other hand, has its usual meaning, that is

$$
p_{ab} = p_{a\cdot}\,p_{\cdot b} \qquad (a = 1, \cdots, \alpha;\ b = 1, \cdots, \beta). \tag{1}
$$
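As a small numerical illustration of condition (1), the following sketch checks whether a table of proportions factors into the product of its marginals. The function name `is_independent` and both example tables are hypothetical, chosen only for illustration; exact rational arithmetic is used so that no rounding tolerance has to be assumed.

```python
from fractions import Fraction as F

def is_independent(p):
    """Return True if p[a][b] == p_a. * p_.b for every cell, as in condition (1)."""
    row = [sum(r) for r in p]            # marginal proportions p_a.
    col = [sum(c) for c in zip(*p)]      # marginal proportions p_.b
    return all(p[a][b] == row[a] * col[b]
               for a in range(len(p)) for b in range(len(p[0])))

# Hypothetical tables of proportions (each sums to 1); not data from the paper.
independent = [[F(1, 10), F(2, 10), F(1, 10)],
               [F(3, 20), F(3, 10), F(3, 20)]]
dependent = [[F(4, 10), F(1, 10)],
             [F(3, 10), F(2, 10)]]

print(is_independent(independent))
print(is_independent(dependent))
```

The first table is built as the product of row marginals (.4, .6) and column marginals (.25, .5, .25), so the check succeeds; in the second, the upper-left cell is .4 while the product of its marginals is .35.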


Conventions like these have seemed important to some authors, but
we believe they diminish in importance as the meaningfulness of the
measure of association increases. One real danger connected with such
conventions is that the investigator may carry over size preconceptions
based upon experience with completely different measures subject to
the same conventions. For example, some elementary statistics
textbooks warn that a population correlation coefficient less than about .5
in absolute value may have little practical significance, in the sense
that then the conditional variance is not much less than the marginal
variance. Research workers in various fields thus tend to develop rather
strong feelings that population correlation coefficients less than, say,
.5, have little substantive importance. The same feelings might be
carried over, without justification, to all other measures of association
so defined as to lie between +1 and −1.

It should also be mentioned that once one has a measure of
association satisfying one of the above conventions, then an infinite number
of others also satisfying the same convention can be obtained, for
example, by raising to a power and adjusting the sign if necessary.


4. TRADITIONAL MEASURES 


Excellent accounts of these may be found in [16], Chaps. 2 and 3,
and [7], Chap. 13. Many of these stem from the standard chi-square
statistic upon which a test of independence is usually based. If a finite
population has $\nu$ members and we set $\nu_{ab} = \nu p_{ab}$,
$\nu_{a\cdot} = \nu p_{a\cdot}$, $\nu_{\cdot b} = \nu p_{\cdot b}$, etc.,
the chi-square statistic in the case of two classifications is

$$
\chi^2 = \sum_a \sum_b \frac{(\nu_{ab} - \nu_{a\cdot}\nu_{\cdot b}/\nu)^2}{\nu_{a\cdot}\nu_{\cdot b}/\nu}
       = \nu \sum_a \sum_b \frac{(p_{ab} - p_{a\cdot}p_{\cdot b})^2}{p_{a\cdot}p_{\cdot b}}
       = \nu \sum_a \sum_b \frac{p_{ab}^2}{p_{a\cdot}p_{\cdot b}} - \nu. \tag{2}
$$

A great deal of attention has been given to the case $\alpha = \beta = 2$. For
this special case Yule has defined the following coefficient of association:

$$
Q = \frac{\nu_{11}\nu_{22} - \nu_{12}\nu_{21}}{\nu_{11}\nu_{22} + \nu_{12}\nu_{21}}, \tag{3}
$$

whose numerator squared is essentially the same as that of a convenient
and popular form for $\chi^2$ in the 2×2 case. Another coefficient suggested
by Yule for the 2×2 case is

$$
Y = \frac{\sqrt{\nu_{11}\nu_{22}} - \sqrt{\nu_{12}\nu_{21}}}{\sqrt{\nu_{11}\nu_{22}} + \sqrt{\nu_{12}\nu_{21}}}. \tag{4}
$$

A coefficient often used for the general $\alpha \times \beta$ case is simply $\chi^2/\nu$, often
called the mean square contingency and denoted by $\phi^2$. A variation
of this, suggested by Karl Pearson, is

$$
C = \sqrt{\frac{\chi^2/\nu}{1 + \chi^2/\nu}}, \tag{5}
$$

which has been called the coefficient of contingency, or the coefficient
of mean square contingency. Another variation, proposed by
Tschuprow, is

$$
T = \sqrt{\frac{\chi^2/\nu}{\sqrt{(\alpha - 1)(\beta - 1)}}}. \tag{6}
$$

The last two suggestions, according to Kendall [7], were made in
attempts to norm $\chi^2$ so that it might lie between 0 and 1 and take the
extreme values under independence and “complete association.”
Cramér ([1], p. 282) suggests the following variant:

$$
\frac{\chi^2/\nu}{\operatorname{Min}(\alpha - 1,\ \beta - 1)}, \tag{7}
$$

which gives a better norming than does C or T since it lies between 0
and 1 and actually attains both end points appropriately. Cramér’s
suggestion does not seem to be well known by workers using this
general kind of index.

The fact that an excellent test of independence may be based on $\chi^2$
does not at all mean that $\chi^2$, or some simple function of it, is an
appropriate measure of degree of association. A discussion of this point
is presented by R. A. Fisher ([2], Section 21). We have been unable to
find any convincing published defense of $\chi^2$-like statistics as measures
of association.
One difficulty with the use of the traditional measures, or of any
measures that are not given operational interpretation, is that it is
difficult to compare meaningfully their values for two
cross-classifications. Suppose that C turns out to be .56 and .24 respectively in two
cross-classification tables. One wants to be able to say that there is
higher association in the first table than the second, but investigators
sometimes restrain themselves, with commendable caution, from
making such a comparison. Their restraint may stem in part from the
noninterpretability of C. (Of course, when samples are small they may
also be restrained by inadequate knowledge of sampling fluctuation.)

One class of measures that will not be discussed here is characterized
by the assignment of numerical scores to the classes, followed by the
use of the correlation coefficient on these scores. A recent article on
such measures is by E. J. Williams [12]. It contains references leading
back to earlier literature. We feel that the use of arbitrary scores to
motivate measures is infrequently appropriate, but it should be pointed
out that measures not motivated by the correlation of scores can often
be thought of from the score viewpoint.


5. MEASURES BASED ON OPTIMAL PREDICTION 


5.1. Asymmetrical Optimal Prediction. A Particular Model of Activity

Let us consider first a probabilistic model which might be useful in
a situation of the following kind:

(i) Two polytomies, A and B.
(ii) No relevant underlying continua.
(iii) No natural ordering of interest.
(iv) Asymmetry holds: The A classification precedes the B
classification chronologically, causally, or otherwise.

An example of such a situation might be a study of the association
between college attended (A) and kind of adult occupation (B). Our
model of activity is the following: An individual is chosen at random
from the population and we are asked to guess his B class as well as
we can, either

1. Given no further information, or
2. Given his A class.

Clearly we can do no worse in case 2 than in case 1. Represent by $p_{\cdot m}$
the largest marginal proportion among the B classes and by $p_{am}$ the
largest proportion in the $a$th row of the cross-classification table, that
is

$$
p_{\cdot m} = \max_b p_{\cdot b}; \qquad p_{am} = \max_b p_{ab}. \tag{8}
$$

Then in case 1 we are best off guessing that $B_b$ for which $p_{\cdot b} = p_{\cdot m}$ (that
is, guessing that B class which has the largest marginal proportion), and
our probability of error is $1 - p_{\cdot m}$. In case 2 we are best off guessing that
$B_b$ for which $p_{ab} = p_{am}$ (letting $A_a$ be the given A class), that is, guessing
that B class that has the largest proportion in the observed A class,
and our probability of error is² $1 - \sum_a p_{am}$.

² It may be that in case 1 there is more than one $b$ for which $p_{\cdot b} = p_{\cdot m}$. Then any method of choosing
which of these $b$'s to guess (including flipping an appropriately multi-sided die) gives rise to the same
probability of error, $1 - p_{\cdot m}$. A similar comment applies to case 2.

Then we propose as a measure of association (following Guttman [4])

$$
\lambda_b = \frac{(\text{Prob. of error in case 1}) - (\text{Prob. of error in case 2})}{(\text{Prob. of error in case 1})}
          = \frac{\sum_a p_{am} - p_{\cdot m}}{1 - p_{\cdot m}}, \tag{9}
$$

which is the relative decrease in probability of error in guessing $B_b$ as
between $A_a$ unknown and $A_a$ known. To put this another way, $\lambda_b$ gives
the proportion of errors that can be eliminated by taking account of
knowledge of the A classifications of individuals.

Some important properties of $\lambda_b$ follow:

(i) $\lambda_b$ is indeterminate if and only if the population lies in one
column, that is, lies in one B class.

(ii) Otherwise the value of $\lambda_b$ is between 0 and 1 inclusive.

(iii) $\lambda_b$ is 0 if and only if knowledge of the A classification is of no
help in predicting the B classification, i.e., if there exists a $b_0$
such that $p_{ab_0} = p_{am}$ for all $a$.

(iv) $\lambda_b$ is 1 if and only if knowledge of an individual's A class
completely specifies his B class, i.e., if each row of the
cross-classification table contains at most one nonzero $p_{ab}$.

(v) In the case of statistical independence $\lambda_b$, when determinate, is
zero. The converse need not hold: $\lambda_b$ may be zero without
statistical independence holding.

(vi) $\lambda_b$ is unchanged by permutation of rows or columns.
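Properties (iv) and (v) can be checked on small hypothetical tables. The sketch below computes the measure directly from (9) with exact fractions; the function name `lambda_b` and the tables are illustrative assumptions, not data from the paper. In the first table the first column is the best guess in every row, so knowledge of the A class removes no errors, yet the cells do not factor into products of the marginals.

```python
from fractions import Fraction as F

def lambda_b(p):
    """Measure (9): relative decrease in the probability of error in guessing
    the B class when the A class becomes known."""
    col = [sum(c) for c in zip(*p)]          # marginal proportions p_.b
    p_dot_m = max(col)                       # largest column marginal, p_.m
    if p_dot_m == 1:
        raise ValueError("indeterminate: population lies in one B class")
    sum_row_max = sum(max(r) for r in p)     # sum over a of p_am
    return (sum_row_max - p_dot_m) / (1 - p_dot_m)

# Hypothetical tables: rows are A classes, columns are B classes.
zero_but_dependent = [[F(4, 10), F(1, 10)],
                      [F(3, 10), F(2, 10)]]
perfectly_predictive = [[F(1, 2), 0],
                        [0, F(1, 2)]]

print(lambda_b(zero_but_dependent))      # 0, although p_11 = 2/5 differs from p_1. p_.1 = 7/20
print(lambda_b(perfectly_predictive))    # 1
```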


That $\lambda_b$ may be zero without statistical independence holding may
be considered by some as a disadvantage of this measure. We feel,
however, that this is not the case, for $\lambda_b$ is constructed specifically to
measure association in a restricted but definite sense, namely the
predictive interpretation given. If there is no association in that sense,
even though there is association in other senses, one would want $\lambda_b$ to
be zero. Moreover, all the measures of association of which we know
are subject to this kind of criticism in one form or another, and indeed
it seems inevitable. To obtain a measure of association one must
sharpen the definition of association, and this means that of the many
vague intuitive notions of the concept some must be dropped.

We may similarly define

$$
\lambda_a = \frac{\sum_b p_{mb} - p_{m\cdot}}{1 - p_{m\cdot}}, \tag{10}
$$

where

$$
p_{m\cdot} = \max_a p_{a\cdot}; \qquad p_{mb} = \max_a p_{ab}. \tag{11}
$$

Thus $\lambda_a$ is the relative decrease in probability of error in guessing $A_a$ as
between $B_b$ unknown and known.

So far as we know, $\lambda_a$ and $\lambda_b$ were first suggested by Guttman ([4],
Part I, 4), and our development of them is very similar to his.


5.2. Symmetrical Optimal Prediction. Another Model of Activity

In many cases the situation is symmetrical, and one may alter the
model of activity as follows: an individual is chosen at random from
the population and we are asked to guess his A class half the time (at
random) and his B class half the time (at random) either given:

1. No further information, or
2. The class of the individual other than the one being guessed; that
is, the individual's $A_a$ when we guess $B_b$ and vice versa.

In case 1 the probability of error is $1 - \tfrac{1}{2}(p_{\cdot m} + p_{m\cdot})$, and in case 2 the
probability of error is $1 - \tfrac{1}{2}\left(\sum_a p_{am} + \sum_b p_{mb}\right)$. Hence we may consider
the relative decrease in probability of error as we go from case 1 to
case 2, and define the coefficient

$$
\lambda = \frac{\tfrac{1}{2}\left[\sum_a p_{am} + \sum_b p_{mb} - p_{\cdot m} - p_{m\cdot}\right]}{1 - \tfrac{1}{2}(p_{\cdot m} + p_{m\cdot})}. \tag{12}
$$

Some properties of $\lambda$ follow:

(i) $\lambda$ is determinate except when the entire population lies in a
single cell of the table.
(ii) Otherwise the value of $\lambda$ is between 0 and 1 inclusive.
(iii) $\lambda$ is 1 if and only if all the population is concentrated in cells no
two of which are in the same row or column.
(iv) $\lambda$ is 0 in the case of statistical independence, but the converse
need not hold.
(v) $\lambda$ is unchanged by permutations of rows or columns.
(vi) $\lambda$ lies between $\lambda_a$ and $\lambda_b$ inclusive.

The computation of $\lambda_a$, $\lambda_b$, or $\lambda$ is extremely simple. Usually one is
given the population, not in terms of the $p_{ab}$'s but rather in terms of the
numbers of individuals in each cell. Let $\nu$ be the total number of
individuals in the population, $\nu_{ab} = \nu p_{ab}$, $\nu_{am} = \nu p_{am}$, $\nu_{mb} = \nu p_{mb}$, and so on.
Then

$$
\lambda_b = \frac{\sum_a \nu_{am} - \nu_{\cdot m}}{\nu - \nu_{\cdot m}}, \tag{13}
$$

$$
\lambda_a = \frac{\sum_b \nu_{mb} - \nu_{m\cdot}}{\nu - \nu_{m\cdot}}, \tag{14}
$$

$$
\lambda = \frac{\sum_a \nu_{am} + \sum_b \nu_{mb} - \nu_{\cdot m} - \nu_{m\cdot}}{2\nu - (\nu_{\cdot m} + \nu_{m\cdot})}. \tag{15}
$$

5.3. An Example

The following table is taken from reference [7], p. 300, and originally
was given by Ammon in “Zur Anthropologie der Badener.” It deals
with hair and eye color of males. The table is given in terms of the
$\nu_{ab}$'s. $A_1$, $A_2$, $A_3$ are respectively Blue, Grey or Green, Brown; $B_1$, $B_2$,
$B_3$, $B_4$ are respectively Fair, Brown, Black, Red.

Eye Color               Hair Color Group
Group          B_1      B_2      B_3      B_4      ν_a.
A_1           1768      807      189       47      2811
A_2            946     1387      746       53      3132
A_3            115      438      288       16       857
ν_.b          2829     2632     1223      116    ν = 6800

We have:

$\nu_{1m} = 1768$        $\nu_{m1} = 1768$
$\nu_{2m} = 1387$        $\nu_{m2} = 1387$
$\nu_{3m} = 438$         $\nu_{m3} = 746$
                         $\nu_{m4} = 53$
$\nu_{\cdot m} = 2829$   $\nu_{m\cdot} = 3132$

$$
\lambda_a = \frac{3{,}954 - 3{,}132}{6{,}800 - 3{,}132} = \frac{822}{3{,}668} = .2241
$$

$$
\lambda_b = \frac{3{,}593 - 2{,}829}{6{,}800 - 2{,}829} = \frac{764}{3{,}971} = .1924
$$

$$
\lambda = \frac{822 + 764}{3{,}668 + 3{,}971} = \frac{1{,}586}{7{,}639} = .2076
$$
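The arithmetic above can be reproduced mechanically from the cell counts. The following is a minimal sketch of the computing formulas (13) to (15); the function name `lambdas` and the variable names are our own labels for illustration, and the counts are those of the hair and eye color table above.

```python
def lambdas(counts):
    """Return (lambda_a, lambda_b, lambda) from a table of cell counts,
    following the computing formulas (13)-(15)."""
    n = sum(sum(r) for r in counts)                   # nu, the population size
    rows = [sum(r) for r in counts]                   # nu_a.
    cols = [sum(c) for c in zip(*counts)]             # nu_.b
    sum_row_max = sum(max(r) for r in counts)         # sum over a of nu_am
    sum_col_max = sum(max(c) for c in zip(*counts))   # sum over b of nu_mb
    lam_b = (sum_row_max - max(cols)) / (n - max(cols))
    lam_a = (sum_col_max - max(rows)) / (n - max(rows))
    lam = ((sum_row_max + sum_col_max - max(cols) - max(rows))
           / (2 * n - max(cols) - max(rows)))
    return lam_a, lam_b, lam

# Eye color (rows A_1..A_3) by hair color (columns B_1..B_4).
hair_eye = [[1768, 807, 189, 47],
            [946, 1387, 746, 53],
            [115, 438, 288, 16]]

lam_a, lam_b, lam = lambdas(hair_eye)
print(f"{lam_a:.4f} {lam_b:.4f} {lam:.4f}")   # 0.2241 0.1924 0.2076
```

The symmetric coefficient falls between the two asymmetric ones, as property (vi) of Section 5.2 requires.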


(Quotients are given to four places.) The traditional measures of
association have the following values: $\chi^2/\nu = .1581$, $C = .3695$, $T = .2541$,
Cramér's measure $= .2812$.
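For comparison, the traditional indices of Section 4 can be computed from the same counts. The sketch below uses the last form of (2) divided by the population size, together with (5), (6) and (7); the function name `traditional_measures` is hypothetical. Two cautions apply. First, the results agree with the quoted values only to about three decimal places, which may reflect rounding in the source. Second, the quoted value for Cramér's measure appears to correspond to the square root of expression (7) as printed; this is our reading of the numbers, not a statement made in the text.

```python
import math

def traditional_measures(counts):
    """Return (phi2, C, T, cramer) for a table of counts, where phi2 = chi^2 / nu."""
    rows = [sum(r) for r in counts]               # nu_a.
    cols = [sum(c) for c in zip(*counts)]         # nu_.b
    # chi^2 / nu = sum_ab nu_ab^2 / (nu_a. nu_.b) - 1, from the last form of (2)
    phi2 = sum(counts[a][b] ** 2 / (rows[a] * cols[b])
               for a in range(len(rows)) for b in range(len(cols))) - 1
    alpha, beta = len(rows), len(cols)
    C = math.sqrt(phi2 / (1 + phi2))                            # (5)
    T = math.sqrt(phi2 / math.sqrt((alpha - 1) * (beta - 1)))   # (6)
    cramer = phi2 / min(alpha - 1, beta - 1)                    # (7) as printed
    return phi2, C, T, cramer

hair_eye = [[1768, 807, 189, 47],
            [946, 1387, 746, 53],
            [115, 438, 288, 16]]

phi2, C, T, cramer = traditional_measures(hair_eye)
print(round(phi2, 3), round(C, 3), round(T, 3), round(math.sqrt(cramer), 3))
```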

This example appears as an illustration of the usual approach to
measures of association in [7], a standard statistical reference work.

It is not hard to think of interpretations or variations in which one
of the $\lambda$ coefficients would be appropriate. For example, one might be
studying the efficacy of an identification scheme for males in which
hair color was given but not eye color. Another example might be in
connection with a study of popular beliefs about the relationship
between hair color and eye color.


5.4. Weighting Columns or Rows 


In some cases, particularly when comparisons between different
populations are important, the measures $\lambda_a$, $\lambda_b$, or $\lambda$ may not be
suitable, since they depend essentially on the marginal frequencies. To
put this in terms of the model of activity: in some cases we do not want to
think of choosing an individual from the actual population at hand in
a random way, but rather from some other population which is related
to the actual population in terms of conditional frequencies.

This point is stressed by Yule in reference [15] and is illustrated by
the kind of medical example³ given there. Suppose that we are
concerned with the effects of a medical treatment on persons contracting
an often fatal disease. Very large samples from two different hospitals
are available, giving the following $p_{ab}$ tables:

                    Hospital I                  Hospital II
               Lived   Died   Total        Lived   Died   Total
Treated         .84     .04     .88         .42     .02     .44
Not treated     .03     .09     .12         .14     .42     .56
Total           .87     .13    1.00         .56     .44    1.00

Here the A classes are Treated or Not-treated, and the B classes Lived
or Died. The given numbers are $p_{ab}$'s and marginal $p$'s.

We are interested in the association between treatment and life, and
might conclude that $\lambda_b$ would be an appropriate measure of this. We
find

$$
\lambda_b \text{ for Hospital I} = \frac{.93 - .87}{.13} = .462
$$

$$
\lambda_b \text{ for Hospital II} = \frac{.84 - .56}{.44} = .636.
$$

³ We do not wish to suggest by this example that $\lambda_b$ is necessarily appropriate as a measure of
association between treatment and cure. An interesting discussion of this medical case has been
given by Greenwood and Yule [3] who bring out many difficulties and suggest various viewpoints.
Another interesting paper on the medical 2×2 table is that of Youden [14].

Yet the conditional probabilities of life, given treatment
(nontreatment), are exactly the same for both hospitals, namely .955 (.250).
The reason that the conditional probabilities are the same while the
$\lambda_b$ values are different is, of course, that the two hospitals treated very
different proportions of their patients. And the proportions treated
were probably determined by factors having nothing to do with
‘inherent’ association between treatment and cure.

It may seem reasonable in such a case as this to replace our model of
activity by one in which an individual is drawn from the population so
that the probability of his being in any given $A_a$ is exactly $1/\alpha$, i.e., so
that all A classes are equiprobable; and with conditional B class
probabilities equal to those of the original population. That is to say, it
may seem reasonable to replace the quantities $p_{ab}$ by the quantities

$$
\frac{1}{\alpha}\,\frac{p_{ab}}{p_{a\cdot}} \tag{16}
$$

and use this as the population to which $\lambda_b$ is applied. We may thus
define, in terms of the conditional probabilities given $A_a$,

$$
\lambda_b^* = \frac{\dfrac{1}{\alpha}\sum_a \max_b \dfrac{p_{ab}}{p_{a\cdot}} - \dfrac{1}{\alpha}\max_b \sum_a \dfrac{p_{ab}}{p_{a\cdot}}}
                   {1 - \dfrac{1}{\alpha}\max_b \sum_a \dfrac{p_{ab}}{p_{a\cdot}}}. \tag{17}
$$

If we do this in the present example, we get, of course, the same
altered $p$ table for both hospitals


and in both cases

$$
\lambda_b^* = \frac{.25}{.398} = .628.
$$

An analogous procedure could be used to define $\lambda_a^*$ and $\lambda^*$. Note also
that other ‘artificial’ marginal $p$'s besides .5 could be used if
appropriate. Yule [15] suggests as a desideratum for coefficients of association
their invariance under transformations on the $\{p_{ab}\}$ matrix of form

$$
p_{ab} \to s_a t_b\, p_{ab}, \qquad s_a, t_b > 0; \quad a = 1, \cdots, \alpha; \quad b = 1, \cdots, \beta.
$$

Such a transformation may readily be found (at least when no $p_{ab} = 0$)
to make all four marginals of a two by two table equal to .5. In
this connection, we refer to a recent article by Pompilj [10] in which
such transformations are carefully discussed.
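The effect of the reweighting in (16) and (17) can be sketched as follows. The function name `lambda_b_star` is hypothetical, and the two hospital tables are assumed inputs (in per cent) chosen to agree with the conditional proportions .955 and .250 quoted above. The formula is applied after multiplying numerator and denominator of (17) by the number of A classes.

```python
def lambda_b_star(table):
    """Measure (17): lambda_b after every A class (row) is given equal weight.
    The table may hold proportions or counts, since only the conditional
    proportions p_ab / p_a. enter the formula."""
    alpha = len(table)
    cond = [[x / sum(r) for x in r] for r in table]    # p_ab / p_a.
    sum_row_max = sum(max(r) for r in cond)            # sum_a max_b p_ab / p_a.
    max_col_sum = max(sum(c) for c in zip(*cond))      # max_b sum_a p_ab / p_a.
    return (sum_row_max - max_col_sum) / (alpha - max_col_sum)

# Assumed tables in per cent: rows Treated / Not treated, columns Lived / Died.
hospital_1 = [[84, 4], [3, 9]]
hospital_2 = [[42, 2], [14, 42]]

print(round(lambda_b_star(hospital_1), 4), round(lambda_b_star(hospital_2), 4))
```

Both hospitals give the same value, 22/35 or about .6286 before rounding, because their conditional proportions coincide; the slightly smaller figure in the text presumably reflects rounding of the intermediate quantities.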

All further measures may be considered for unweighted or weighted 
marginal proportions, whichever are appropriate. 


6. MEASURES BASED UPON OPTIMAL PREDICTION OF ORDER

6.1. Preliminaries

Heretofore we have considered measures of association suitable for
the unordered case, that is, measures which do not change if the
columns (rows) are permuted. Now we shall suggest a measure
suitable for the ordered case. Suppose that the situation is of the following
kind:

(i) Two polytomies, A and B.

(ii) No relevant underlying continua.

(iii) Directed ordering is of interest.

(iv) The two polytomies appear symmetrically.


By (iii) we mean that we wish to distinguish, in the 3×3 case between,
for example,


calling the first of these complete association and the second complete
counterassociation. We may wish to make the convention that in these
two cases the proposed measure should take the values +1 and −1
respectively. If the sense or direction of order is irrelevant we can, for
example, simply take the absolute value of a measure appropriate to
directed ordering.



There are vaguenesses in the idea of complete ordered association.
For example, everyone would probably agree that the following case
is one of complete association:


— 
0 0 0 
pa 0 0 
0 pia 0 


As before, the procedure we shall adopt toward this and toward
complex questions is to base the measure of association on a probabilistic
model of activity which often may be appropriate and typical.


6.2. A Proposed Measure

Our proposed model will now be described. Suppose that two
individuals are taken independently and at random from the population
(technically with replacement, but this is unimportant for large
populations). Each falls into some $(A_a, B_b)$ cell. Let us say that the first falls
in the $(A_{\underline{a}_1}, B_{\underline{b}_1})$ cell, and the second in the $(A_{\underline{a}_2}, B_{\underline{b}_2})$ cell. (Underlined
letters denote random variables.) $\underline{a}_i$ $(i = 1, 2)$ takes values from 1 to $\alpha$;
$\underline{b}_i$ $(i = 1, 2)$ takes values from 1 to $\beta$.

If there is independence, one expects that the order of the $\underline{a}$'s has
no connection with the order of the $\underline{b}$'s. If there is high association one
expects that the order of the $\underline{a}$'s would generally be the same as that of
the $\underline{b}$'s. If there is high counterassociation one expects that the orders
would generally be different.

Let us therefore ask about the probabilities for like and unlike
orders. In order to avoid ambiguity, these probabilities will be taken
conditionally on the absence of ties. Set

$$
\Pi_s = \Pr\{\underline{a}_1 < \underline{a}_2 \text{ and } \underline{b}_1 < \underline{b}_2;\ \text{or } \underline{a}_1 > \underline{a}_2 \text{ and } \underline{b}_1 > \underline{b}_2\} \tag{18}
$$

$$
\Pi_d = \Pr\{\underline{a}_1 < \underline{a}_2 \text{ and } \underline{b}_1 > \underline{b}_2;\ \text{or } \underline{a}_1 > \underline{a}_2 \text{ and } \underline{b}_1 < \underline{b}_2\} \tag{19}
$$

$$
\Pi_t = \Pr\{\underline{a}_1 = \underline{a}_2 \text{ or } \underline{b}_1 = \underline{b}_2\}. \tag{20}
$$

Then the conditional probability of like orders given no ties is $\Pi_s/(1 - \Pi_t)$
and the conditional probability of unlike orders given no ties is
$\Pi_d/(1 - \Pi_t)$. Of course, the sum of these two quantities is one.

A possible measure of association would then be $\Pi_s/(1 - \Pi_t)$, but it
is a bit more convenient to look at the following quantity:

$$
\gamma = \frac{\Pi_s - \Pi_d}{1 - \Pi_t}, \tag{21}
$$

or the difference between the conditional probabilities of like and unlike
orders. In other words $\gamma$ tells us how much more probable it is to get
like than unlike orders in the two classifications, when two individuals
are chosen at random from the population.

Since $\Pi_s + \Pi_d = 1 - \Pi_t$, we may write $\gamma$ as

$$
\gamma = \frac{2\Pi_s - 1 + \Pi_t}{1 - \Pi_t}, \tag{22}
$$

which is convenient for computation, using the easily checked
relationships

$$
\Pi_s = 2 \sum_a \sum_b p_{ab} \Big\{ \sum_{a' > a} \sum_{b' > b} p_{a'b'} \Big\} \tag{23}
$$

$$
\Pi_t = \sum_a p_{a\cdot}^2 + \sum_b p_{\cdot b}^2 - \sum_a \sum_b p_{ab}^2. \tag{24}
$$


КАЗ 
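The computing relationships (22) to (24) can be sketched in a few lines of Python. This is only an illustration: the function names and the small 2 x 2 table `toy` are assumptions made here, not part of the original text.

```python
def pair_probabilities(rho):
    """Return (pi_s, pi_d, pi_t) for a table of proportions rho[a][b]."""
    n_rows, n_cols = len(rho), len(rho[0])
    # (23): twice the probability that the second individual falls strictly
    # below and to the right of the first.
    pi_s = 2 * sum(
        rho[a][b] * rho[a2][b2]
        for a in range(n_rows) for b in range(n_cols)
        for a2 in range(a + 1, n_rows) for b2 in range(b + 1, n_cols)
    )
    # (24): row ties plus column ties, less same-cell pairs counted twice.
    row_tot = [sum(row) for row in rho]
    col_tot = [sum(row[b] for row in rho) for b in range(n_cols)]
    pi_t = (sum(r * r for r in row_tot) + sum(c * c for c in col_tot)
            - sum(x * x for row in rho for x in row))
    pi_d = 1 - pi_s - pi_t
    return pi_s, pi_d, pi_t


def gamma(rho):
    """The measure in the computing form (22)."""
    pi_s, _, pi_t = pair_probabilities(rho)
    return (2 * pi_s - 1 + pi_t) / (1 - pi_t)


# Hypothetical 2 x 2 table of proportions, chosen only for illustration.
toy = [[0.3, 0.1],
       [0.2, 0.4]]
print(pair_probabilities(toy), gamma(toy))
```

For this table the like-order pairs can only combine the (1,1) and (2,2) cells and the unlike-order pairs the (1,2) and (2,1) cells, so the result is easy to check by hand.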
Some important properties of γ follow:


(i) γ is indeterminate if the population is concentrated in a single
row or column of the cross-classification table.

(ii) γ is 1 if the population is concentrated in an upper-left to
lower-right diagonal of the cross-classification table. γ is −1
if the population is concentrated in a lower-left to upper-right
diagonal of the table.

(iii) γ is 0 in the case of independence, but the converse need not
hold except in the 2×2 case. An example of nonindependence
with γ = 0 is
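Properties (i) to (iii) can be checked numerically by enumerating ordered pairs of cells directly from the definitions of like and unlike order. The sketch below is an illustration only: the function name and all four tables are assumptions made here, and the cross-shaped table is one nonindependent table with γ = 0, not necessarily the example printed in the original.

```python
from itertools import product


def gamma_by_enumeration(rho):
    """Gamma from the definitions: enumerate ordered pairs of cells."""
    cells = [(a, b, p) for a, row in enumerate(rho) for b, p in enumerate(row)]
    like = unlike = 0.0
    for (a1, b1, p1), (a2, b2, p2) in product(cells, repeat=2):
        sign = (a1 - a2) * (b1 - b2)  # zero when the pair is tied
        if sign > 0:
            like += p1 * p2
        elif sign < 0:
            unlike += p1 * p2
    if like + unlike == 0:
        return None  # property (i): every pair is tied, gamma indeterminate
    return (like - unlike) / (like + unlike)


# Hypothetical tables of proportions.
main_diagonal = [[0.5, 0, 0], [0, 0.3, 0], [0, 0, 0.2]]
anti_diagonal = [[0, 0, 0.2], [0, 0.3, 0], [0.5, 0, 0]]
single_row = [[0.2, 0.3, 0.5], [0, 0, 0], [0, 0, 0]]
cross = [[0, 0.2, 0], [0.2, 0.2, 0.2], [0, 0.2, 0]]

for table in (main_diagonal, anti_diagonal, single_row, cross):
    print(gamma_by_enumeration(table))
```

The cross-shaped table is not independent (its upper-left cell is empty although the first row and first column both have positive totals), yet like and unlike orders are equally probable.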




For tables up to 5×5 with ρ's expressed to two decimal places
computation is fairly rapid. If many tables of the same size are at hand a
cardboard template would be convenient. A check on $\Pi_s$ is to recompute
using inverted ordering in both dimensions. γ may be rewritten
in terms of the ν's by putting "$\nu_{ab}$" for "$\rho_{ab}$," etc., and
replacing "1" in (22) by "$\nu^2$."
In the 2×2 case we find that

(25)    $\gamma = \dfrac{\rho_{11}\rho_{22} - \rho_{12}\rho_{21}}{\rho_{11}\rho_{22} + \rho_{12}\rho_{21}}$.


This is the same as Yule's coefficient of association Q mentioned in
Section 4. In this case γ = ±1 if any one cell is empty. For example,


gives rise to γ = 1 always.
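The step from (21) to (25) is short enough to write out. The following is a worked derivation added for the reader, using only the definitions (18) to (21); it is not taken from the original text.

```latex
% 2 x 2 case. Like orders require one individual in cell (1,1) and the
% other in cell (2,2); unlike orders require cells (1,2) and (2,1).
\begin{align*}
\Pi_s &= 2\rho_{11}\rho_{22}, \qquad
\Pi_d = 2\rho_{12}\rho_{21}, \qquad
1 - \Pi_t = \Pi_s + \Pi_d, \\
\gamma &= \frac{\Pi_s - \Pi_d}{\Pi_s + \Pi_d}
        = \frac{\rho_{11}\rho_{22} - \rho_{12}\rho_{21}}
               {\rho_{11}\rho_{22} + \rho_{12}\rho_{21}} .
\end{align*}
% If \rho_{12} = 0 or \rho_{21} = 0 (and \rho_{11}\rho_{22} > 0), then \gamma = 1;
% if \rho_{11} = 0 or \rho_{22} = 0 (and \rho_{12}\rho_{21} > 0), then \gamma = -1.
```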
Any case of the following forms will give rise to γ = 1, since a
conflict in order is impossible:


The right-hand table might be thought of as a case of "complete
curvilinear association."

Stuart [11], starting from a suggestion by Kendall [6], has proposed
a measure of association in the ordered case much like γ. Stuart's
measure, which he calls $\tau_c$, is, in our notation,

$\tau_c = \dfrac{\Pi_s - \Pi_d}{(m - 1)/m}$,

where $m = \mathrm{Min}(\alpha, \beta)$. The term $(m-1)/m$ is introduced in
order that $\tau_c$ may attain, or nearly attain, the absolute value 1 when
the entire population lies in a longest diagonal of the table. Stuart
develops his measure by considering a two-way ordered classification table
as two rankings of a population, where many ties appear in one or both
rankings as two individuals of the population fall in the same column or row
or both. Then each ordered pair of individuals is assigned a score with
respect to each ranking: 0 if there is a tie, or ±1 as one or the other is
ranked higher. Finally the product-moment correlation coefficient is
formally computed with these scores, and the norming factor is introduced.

Thus, our development of γ is seen to give another and more natural
interpretation for the numerator of $\tau_c$: it is the probability of like
order less the probability of unlike order when two individuals are chosen
at random. In addition the form in which $\tau_c$ is given above, together
with (23) and (24), suggests a computation procedure somewhat different
from that of [11].


6.3. An Example


Whelpton, Kiser, and others [17] have investigated in great detail
relationships between human fertility and a number of social and
psychological characteristics of married couples. The analyses resulting
from these investigations are replete with cross-classification tables,
together with accompanying verbal explanations and recapitulations.
Numerical indexes of association appear to have been used rarely, if at
all, in this work.

We wish to examine briefly one of these cross-classification tables
as an example of a cross-classification with an order in both
classifications. This examination should be construed neither as approval nor
criticism of the methodology used in the studies edited by Whelpton
and Kiser, for this would not be appropriate here. (The reader may
refer to [18] and [19] for critical reviews.) However, we do feel that
the use of summarizing indexes of association in a study of this kind
may well be worth while for at least two reasons. One is that the
reader finds it very difficult to obtain a bird's-eye view of the extensive
numerical material without depending almost wholly on the author's
own conclusions. Second, the use of indexes would mitigate the criticism
that the author, consciously or not, selects from his numerical data
those comparisons that are in line with his a priori beliefs. Needless to
say, an index of association is recommended by these arguments only
if it has some reasonable interpretation.

The particular table we wish to consider follows, in terms of numbers 
of married couples. It refers to a rather special, but well defined, popu- 
lation: white Protestant married couples living in Indianapolis, mar- 
ried in 1927, 1928, or 1929, and so on. The data were obtained by strati- 
fied sampling, with strata based on numbers of live births. However, 
for present purposes we do not consider any questions of sampling, 
response error, specification of population, etc. The table is condensed 
from a more detailed cross-classification given in [17], vol. 2, pp. 286, 
389, and 402. Further, we shall not define the fertility-planning cate- 
gories that follow, but merely indicate the order. 


CROSS-CLASSIFICATION BETWEEN EDUCATIONAL LEVEL OF
WIFE AND FERTILITY-PLANNING STATUS OF COUPLE.
SOURCE [17], VOL. 2. NUMBERS IN BODY
OF TABLE ARE FREQUENCIES


Highest level of              Fertility-planning status of couple
formal education                                                  Row
of wife                        A        B        C        D      totals

One year college or more      102       35       68       34      239
3 or 4 years high school      191       80      215      122      608
Less than 3 years high
  school                      110       90      168      223      591

Column totals                 403      205      451      379     1438

A = most effective planning of number and spacing of children;
D = least effective planning of children.


This is clearly a case where there is relevant order in both
classifications. We may first compute $\Pi_s$ as follows (schematically):

$\Pi_s = \dfrac{2}{(1438)^2}\,[\,102(80 + 90 + 215 + 168 + 122 + 223) + 35(215 + 168 + 122 + 223) + \cdots + 215(223)\,]$

$\phantom{\Pi_s} = \dfrac{2}{2{,}067{,}844}\,[\,102 \times 898 + 35 \times 728 + \cdots + 215 \times 223\,]$

$\phantom{\Pi_s} = \dfrac{2 \times 311{,}632}{2{,}067{,}844} = .301.$

This means that if we pick two couples at random from those included 
in the table, the probability is .301 that they are not tied in either 
classification and that they fall in the same order for both classifications 
(e.g., if educational level of wife is greater for first couple chosen, then 
effectiveness of fertility planning is also greater). 

Similarly we compute that $\Pi_d = .163$. This is the probability of no
ties and different orders. Finally $\Pi_t$, the probability of a tie in at
least one classification, is .536. Note that $\Pi_s + \Pi_d + \Pi_t = 1.000$.

The conditional probability of like order, given no tie, is
$\Pi_s/(1 - \Pi_t) = .301/.464 = .649$; and the conditional probability of
unlike order is $.163/.464 = .351$. Clearly there is a greater chance of like
order than of unlike order, and this means positive association, if the
operational model is a reasonable one. To measure the magnitude of this
association we may use γ, which here is equal to

$\gamma = \dfrac{.301 - .163}{.464} \approx .30.$

This is the difference between the conditional probabilities of like and
unlike order, given no ties.
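The arithmetic of this example can be retraced with a short Python sketch working directly on the frequencies. The variable names are assumptions made here; the counts are those of the table above, with the last entry of the third row taken as 223, the value used in the schematic computation and required by the row total 591.

```python
# Frequencies: rows are education levels (highest first),
# columns are fertility-planning categories A to D.
counts = [
    [102, 35, 68, 34],
    [191, 80, 215, 122],
    [110, 90, 168, 223],
]
n = sum(map(sum, counts))
rows, cols = range(len(counts)), range(len(counts[0]))

# Pairs in like order: second cell strictly below and to the right.
like = sum(counts[a][b] * counts[a2][b2]
           for a in rows for b in cols
           for a2 in rows if a2 > a
           for b2 in cols if b2 > b)
# Pairs in unlike order: second cell strictly below and to the left.
unlike = sum(counts[a][b] * counts[a2][b2]
             for a in rows for b in cols
             for a2 in rows if a2 > a
             for b2 in cols if b2 < b)

pi_s = 2 * like / n ** 2
pi_d = 2 * unlike / n ** 2
pi_t = 1 - pi_s - pi_d
gamma = (pi_s - pi_d) / (1 - pi_t)
print(n, like, unlike)
print(round(pi_s, 3), round(pi_d, 3), round(pi_t, 3), round(gamma, 3))
```

Carrying unrounded probabilities through gives a γ that differs from the three-decimal hand arithmetic only in the third decimal place.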

It might be thought that one should look, not at the actual
population above, but at a related population with equal row totals and with
the same relative frequencies within each row. That is, we might wish
to work with a derived population within which one-third of the wives
lie in each education category, but which is otherwise the same. This
derived population is readily obtained (in terms of its $\rho_{ab}$'s) by
dividing each frequency in the above table by three times the total in its
row. Very minor adjustments were made because of rounding, in order that
the over-all sum be 1.000. For the same reason, the row totals are not
exactly equal.



CROSS-CLASSIFICATION BETWEEN EDUCATIONAL LEVEL OF
WIFE AND FERTILITY-PLANNING STATUS OF COUPLE.
DERIVED FROM PRIOR TABLE BY ADJUSTMENT TO MAKE ROW
TOTALS EQUAL. NUMBERS IN BODY OF TABLE
ARE RELATIVE FREQUENCIES ($\rho_{ab}$'s).


Highest level of              Fertility-planning status of couple
formal education                                                  Row
of wife                        A        B        C        D      totals

One year college or more      .142     .049     .095     .047     .333
3 or 4 years high school      .105     .044     .118     .067     .334
Less than 3 years high
  school                      .062     .050     .095     .126     .333

Column totals                 .309     .143     .308     .240    1.000

A = most effective planning of number and spacing of children;
D = least effective planning of children.


For this table we find $\Pi_s = .325$, $\Pi_d = .170$, $\Pi_t = .505$.

Hence $\Pi_s/(1 - \Pi_t) = .657$, $\Pi_d/(1 - \Pi_t) = .343$, and
$\gamma = .314$. There is no great difference between the original and the
adjusted table in regard to association as measured by probabilities of
like and unlike order.
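The row adjustment itself is a one-line computation. The sketch below (variable names are assumptions made here) divides each frequency by the number of rows times its row total and rounds to three places. Plain rounding reproduces the printed entries except for one cell, .051 against the printed .050, which is consistent with the "very minor adjustments" mentioned in the text.

```python
counts = [
    [102, 35, 68, 34],
    [191, 80, 215, 122],
    [110, 90, 168, 223],
]
n_rows = len(counts)

# Each row is rescaled to carry total weight 1 / n_rows, keeping the
# relative frequencies within the row unchanged.
derived = [[x / (n_rows * sum(row)) for x in row] for row in counts]
rounded = [[round(x, 3) for x in row] for row in derived]

for row in rounded:
    print(row, round(sum(row), 3))
```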

Alternatively, one might wish to adjust the tabular entries so that
column totals are equal, or one might attempt to adjust the entries so
that the row totals are equal and the column totals are equal.


7. THE GENERATION OF MEASURES BY THE INTRODUCTION 
OF LOSS FUNCTIONS : 


7.1. Models Based on Loss Functions 


Instead of obtaining a measure as a natural function of probabilities
in the context of a model of predictive behavior, one can more generally
employ loss functions. In such a way, one can even artificially generate
the conventional measures described in Section 4.


7.2. Loss Functions and the λ Measures


In the context of Section 5.1 let us suppose that in guessing an
individual's B class one incurs a loss $L(b_1, b_2)$, where $B_{b_1}$ is the
true B class and $B_{b_2}$ is the guessed one. Consider first guessing $B_b$
given no information. Then a scheme of guessing $B_b$ with probability
$p_b$ ($p_b \ge 0$, $\sum_b p_b = 1$) leads to an average loss of
$\sum_{b_1} \sum_{b_2} \rho_{\cdot b_1}\, p_{b_2}\, L(b_1, b_2)$. It is easily
seen that this average is minimized by guessing that $B_{b_2}$ for which
$\sum_b \rho_{\cdot b} L(b, b_2)$ is a minimum, or if there are two or more
minima by guessing any one of them. Let $b_z$ be any one of these $b_2$'s,
so that the minimum average loss is $\sum_b \rho_{\cdot b} L(b, b_z)$.

On the other hand if the individual's A class is known to be $A_a$,
the best scheme of guessing is to select $b_2$ to minimize
$\sum_b \rho_{ab} L(b, b_2)$. Let $b_{za}$ be such a minimizing $b_2$; then
the minimum average loss when $A_a$ is known is
$\sum_b (\rho_{ab}/\rho_{a\cdot}) L(b, b_{za})$, and the over-all minimum
average loss with $A_a$'s known is $\sum_a \sum_b \rho_{ab} L(b, b_{za})$.

The decrease in loss as we pass from the first case to the second is
therefore

(26)    $\sum_b \rho_{\cdot b} L(b, b_z) - \sum_a \sum_b \rho_{ab} L(b, b_{za})$.

It would be reasonable to norm this by division by the first term,
$\sum_b \rho_{\cdot b} L(b, b_z)$, to obtain a generalization of $\lambda_b$.

Notice that if $L(b_1, b_2)$ is 0 when $b_1 = b_2$ and 1 when $b_1 \ne b_2$,
we obtain exactly $\lambda_b$. Analogous procedures give us generalizations
of $\lambda_a$ and $\lambda$. A slight extension of the procedure, permitting
the loss to depend on the true A class as well as the true and guessed B
classes, gives a generalization of $\lambda_b^*$.
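A minimal Python sketch of the normed loss reduction, (26) divided by its first term, may make the construction concrete. The function name, the table `rho`, and the layout of the loss matrix as `loss[true][guess]` are assumptions made for this illustration. With 0-1 loss the result should coincide with the optimal-prediction measure written here as (Σ_a max_b ρ_ab − max_b ρ_·b)/(1 − max_b ρ_·b).

```python
def normed_loss_reduction(rho, loss):
    """(26) divided by its first term.

    rho[a][b]  : population proportions
    loss[t][g] : loss when the true B class is t and the guess is g
    """
    n_b = len(rho[0])
    col_tot = [sum(row[b] for row in rho) for b in range(n_b)]
    # Minimum average loss with no information: best single guess.
    base = min(sum(col_tot[t] * loss[t][g] for t in range(n_b))
               for g in range(n_b))
    # Minimum average loss with the A class known: best guess per row.
    informed = sum(
        min(sum(row[t] * loss[t][g] for t in range(n_b)) for g in range(n_b))
        for row in rho
    )
    return (base - informed) / base


# Hypothetical 2 x 3 table and 0-1 loss.
rho = [[0.30, 0.05, 0.05],
       [0.10, 0.20, 0.30]]
zero_one = [[0 if t == g else 1 for g in range(3)] for t in range(3)]

col_tot = [sum(row[b] for row in rho) for b in range(3)]
lambda_b = (sum(max(row) for row in rho) - max(col_tot)) / (1 - max(col_tot))
print(normed_loss_reduction(rho, zero_one), lambda_b)
```

Any other loss matrix with the same shape can be passed in; for example, a loss that grows with the distance between true and guessed class when the B classes are ordered.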


7.3. The Conventional Measures in Terms of Loss Functions 


Suppose, instead of predicting the classes of individuals, we are asked
to determine the values $\rho_{ab}$ when only the $\rho_{a\cdot}$ and
$\rho_{\cdot b}$ are known. In the case of independence, these $\rho_{ab}$
are $\rho_{a\cdot}\rho_{\cdot b}$. In the more general case, the difference
between $\rho_{ab}$ and $\rho_{a\cdot}\rho_{\cdot b}$ may be thought of as the
amount of error made by assuming independence. If the loss is proportional
to the square of the error, inversely proportional to the estimate
$\rho_{a\cdot}\rho_{\cdot b}$, and additive, we have

(27)    $\sum_a \sum_b k_{ab}\, \dfrac{(\rho_{ab} - \rho_{a\cdot}\rho_{\cdot b})^2}{\rho_{a\cdot}\rho_{\cdot b}}$,

where the $k_{ab}$'s are given constants. For comparison with standard
chi-square, express this in terms of the $\nu_{ab}$'s,

(28)    $\sum_a \sum_b \dfrac{k_{ab}}{\nu}\, \dfrac{\left(\nu_{ab} - \dfrac{\nu_{a\cdot}\nu_{\cdot b}}{\nu}\right)^2}{\dfrac{\nu_{a\cdot}\nu_{\cdot b}}{\nu}}$,

and finally set $k_{ab} = \nu$ to obtain just the chi-square statistic.
Although this procedure and loss function seem to us rather
artificial, they do give one way of motivating the chi-square statistic
as a measure of association.
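The passage from proportions to frequencies is a direct substitution. The following worked step is added for the reader and assumes only that $\rho_{ab} = \nu_{ab}/\nu$, with $\nu$ the total frequency.

```latex
% Substitute \rho_{ab} = \nu_{ab}/\nu, \rho_{a\cdot} = \nu_{a\cdot}/\nu,
% \rho_{\cdot b} = \nu_{\cdot b}/\nu into one term of the loss sum.
\begin{align*}
\frac{(\rho_{ab} - \rho_{a\cdot}\rho_{\cdot b})^2}{\rho_{a\cdot}\rho_{\cdot b}}
  &= \frac{\nu^{-2}\,\bigl(\nu_{ab} - \nu_{a\cdot}\nu_{\cdot b}/\nu\bigr)^2}
          {\nu^{-2}\,\nu_{a\cdot}\nu_{\cdot b}}
   = \frac{1}{\nu}\,
     \frac{\bigl(\nu_{ab} - \nu_{a\cdot}\nu_{\cdot b}/\nu\bigr)^2}
          {\nu_{a\cdot}\nu_{\cdot b}/\nu} .
\end{align*}
% Multiplying by a constant equal to \nu and summing over a and b gives
% \chi^2 = \sum_a \sum_b (\nu_{ab} - \nu_{a\cdot}\nu_{\cdot b}/\nu)^2
%          / (\nu_{a\cdot}\nu_{\cdot b}/\nu).
```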


8. RELIABILITY MODELS
8.1. Generalities 


Consider now cases in which the classes are the same for the two
polytomies, so that we deal with an α×α table, but differ in that
assignment to class depends on which of two methods of assignment is used.
Thus we might for example consider two psychological tests both of
which classify deranged individuals as to the type of mental disorder
from which they suffer. Or again, we might consider two observers
taking part in a sociological experiment wherein they independently
and subjectively rate each child in a group of children on a five point
scale for degree of cooperation.

One is often concerned in such cases with the degree to which the
two methods of assignment to class agree with each other. In the case
of the psychological tests, for example, one of the tests might be a well
established standard procedure and the other might be a more easily
applied variant under consideration as a substitute. The psychologist
would probably only consider the variant seriously if it gave the same
answers as the standard test often enough in some sense which he would
have to explicate. In the case of the two observers, the problem might
be whether the kind of subjective ratings given by trained observers
in that context are similar enough to warrant the use of such subjective
ratings at all.

As before we shall not consider here sampling problems, but we
shall suppose the population $\rho_{ab}$'s known. The several distinctions
and conventions of Sections 2 and 3 apply here of course, but the measures
suggested in Sections 5 and 6 do not seem appropriate in this reliability
context. One reason is that the classes are the same for both polytomies.
This means that even in the unordered case we do not want a measure
which is invariant under interchange of rows and interchange of columns
unless the two interchanges are the same.
An obvious measure of reliability in such a study is just
$\sum_a \rho_{aa}$, the probability of agreement. However, we shall also
consider some other possibilities.


8.2. A Measure of Reliability in the Unordered Case 


The measure we shall now propose might be appropriate under the 
following conditions: 


(i) Two polytomies are the same, but arise from different methods 
of assignment to class. 
(ii) No relevant underlying continua. 
(iii) No relevant ordering. 
(iv) Our interest in reliability is symmetrical as between the two 
polytomies. 


A modal class over both classifications is any $A_a$ ($= B_a$) such that
$\rho_{a\cdot} + \rho_{\cdot a} \ge \rho_{a'\cdot} + \rho_{\cdot a'}$ for all
$a'$. It is simplest to suppose that there is a unique modal class, but if
there are more any can be chosen. Denote by $\rho_{M\cdot}$ and
$\rho_{\cdot M}$ the two marginal proportions corresponding to the modal
class.

A modal class can be given the following interpretation: choose an
individual at random from the population and pick one of the two
methods of assignment by flipping a fair coin. What is the long-run-best
guess beforehand of how the chosen method will classify the chosen
individual? The answer is: the modal class; and if the modal class is
$A_a$ then the probability of a correct guess is
$\tfrac{1}{2}(\rho_{a\cdot} + \rho_{\cdot a}) = \tfrac{1}{2}(\rho_{M\cdot} + \rho_{\cdot M})$.

In so far as there is good reliability between the two methods of
assignment, one could make a better guess if one knew how the other
method of assignment would classify the individual, and then followed
the rule of guessing the same class for the method being predicted.
The probability of a correct guess would then be $\sum_a \rho_{aa}$. Thus as
we go from the no information situation to the other-method-known
situation, the probability of error decreases by
$\sum_a \rho_{aa} - \tfrac{1}{2}(\rho_{M\cdot} + \rho_{\cdot M})$. This
quantity may vary from $-\tfrac{1}{2}$ to $1 - (1/\alpha)$. It takes the value
$-\tfrac{1}{2}$ when all the diagonal $\rho_{aa}$'s are zero and the modal
probability, $\rho_{M\cdot} + \rho_{\cdot M}$, is 1. It takes the value
$1 - (1/\alpha)$ when the two methods always agree and each category is
equiprobable.



To get a measure we should alter the above quantity, since a
sufficiently large $\rho_{aa}$ for some a will make the above quantity low
even though $\sum_a \rho_{aa}$ is nearly 1. It seems reasonable to norm by
division by the probability of error given no information, that is by
$1 - \tfrac{1}{2}(\rho_{M\cdot} + \rho_{\cdot M})$. Hence we propose the
measure

(29)    $\lambda_r = \dfrac{\sum_a \rho_{aa} - \tfrac{1}{2}(\rho_{M\cdot} + \rho_{\cdot M})}{1 - \tfrac{1}{2}(\rho_{M\cdot} + \rho_{\cdot M})}$.


This may be interpreted as the relative decrease in error probability as
we go from the no information situation to the other-method-known
situation.

The measure $\lambda_r$ can take values from −1 to 1. It takes the value −1
when all the diagonal $\rho_{aa}$'s are zero and the modal probability,
$\rho_{M\cdot} + \rho_{\cdot M}$, is 1. It takes the value 1 when the two
methods always agree. $\lambda_r$ is indeterminate only when both methods
always give only one and the same class. In the case of independence
$\lambda_r$ assumes no particular value. This characteristic might be
considered a disadvantage, but it seems to us that an index of this kind
would only be used where there is known to be dependence between the
methods, so that misbehavior of the index for independence is not important.
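The measure (29) and its extreme values can be illustrated with a small Python sketch. The function name and the four square tables are hypothetical, introduced only for this illustration.

```python
def lambda_r(rho):
    """Reliability measure (29) for a square table of proportions."""
    k = len(rho)
    row_tot = [sum(rho[a]) for a in range(k)]
    col_tot = [sum(rho[a][b] for a in range(k)) for b in range(k)]
    # Modal class: largest sum of the two marginal proportions.
    modal = max(row_tot[a] + col_tot[a] for a in range(k))
    agreement = sum(rho[a][a] for a in range(k))
    denominator = 1 - modal / 2
    if denominator == 0:
        return None  # both methods always give one and the same class
    return (agreement - modal / 2) / denominator


always_agree = [[0.5, 0.0], [0.0, 0.5]]
never_agree = [[0.0, 1.0], [0.0, 0.0]]
one_class_only = [[1.0, 0.0], [0.0, 0.0]]
mostly_agree = [[0.4, 0.1], [0.1, 0.4]]

for table in (always_agree, never_agree, one_class_only, mostly_agree):
    print(lambda_r(table))
```

In the last table the two methods agree with probability .8 and the best blind guess is right with probability .5, so the relative decrease in error probability is (.8 − .5)/(1 − .5) = .6.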


8.3. Reliability in the Ordered Case 


For the case in which the classes are ordered, but a meaningful
metric is absent, we have been unable to find a measure better than
one of the following kind:


(30a)   $\sum_a \rho_{aa}$   (as suggested in Section 8.1)
(30b)   $\sum_{|a-b| \le 1} \rho_{ab}$
(30c)   $\sum_{|a-b| \le 2} \rho_{ab}$;


that is, the only reasonable measures we know of are those that are
based upon either the probability of agreement, the probability of
agreement to within one neighboring class, two neighboring classes,
and so on. If desired one could weight these probabilities when
classification in a neighboring class is not as desirable as in the same
class. Thus one might consider something like
$\sum_a \rho_{aa} + \tfrac{1}{2} \sum_{|a-b| = 1} \rho_{ab}$ or its obvious
variants. These measures may also be justified easily by loss-function
arguments.



9. PROPORTIONAL PREDICTION


Instead of basing a measure of association on optimal prediction
one might consider measures based upon a prediction method which
reconstructs the population, in a sense to be described. The use of such
a measure was suggested to us by W. Allen Wallis. For simplicity, we
restrict ourselves to the asymmetric situation of Section 5.1 where
$\lambda_b$ was constructed. Of course one could apply the same approach in
other situations.

Our model of activity, as before, is the following: An individual is 
chosen at random from the population and we are asked to guess his 
B class either (1) given no information or (2) given his A class. 

Optimal guessing will lead to a definite B class in case (1) and to a
definite B class for each A class in case (2) (except that in the case of
tied $\rho_{\cdot b}$'s or $\rho_{ab}$'s we have some choice). While such
optimal guessing leads to the lowest average frequency of error, the
resulting distribution of guessed classes will usually be very different
from the original distribution in the population. For some purposes this
might be undesirable and one is led to the following model of activity:


Case 1. Guess $B_1$ with probability $\rho_{\cdot 1}$, $B_2$ with probability
        $\rho_{\cdot 2}$, ..., $B_\beta$ with probability $\rho_{\cdot \beta}$.
Case 2. Guess $B_1$ with probability $\rho_{a1}/\rho_{a\cdot}$ (the conditional
        probability of $B_1$ given $A_a$), $B_2$ with probability
        $\rho_{a2}/\rho_{a\cdot}$, ..., $B_\beta$ with probability
        $\rho_{a\beta}/\rho_{a\cdot}$.


In each case the guessing is to proceed by throwing a β-sided die whose
bth side appears with probability $\rho_{\cdot b}$ (case 1) or
$\rho_{ab}/\rho_{a\cdot}$ (case 2). This may be accomplished using a table of
"random numbers." If we make many such guesses independently it is plain
that we shall approximately reconstruct the marginal distribution of the
$B_b$'s (case 1) and the joint distribution of the $(A_a, B_b)$'s (case 2).

The long-run proportion of correct predictions in case (1) will be
$\sum_{b=1}^{\beta} \rho_{\cdot b}^2$, and in case (2) it will be
$\sum_{a=1}^{\alpha} \sum_{b=1}^{\beta} \rho_{ab}^2/\rho_{a\cdot}$. Hence the
relative decrease in the proportion of incorrect predictions as we go from
case (1) to case (2) is

(31)    $\tau_b = \dfrac{\sum_a \sum_b \rho_{ab}^2/\rho_{a\cdot} - \sum_b \rho_{\cdot b}^2}{1 - \sum_b \rho_{\cdot b}^2}$,

which can be readily expressed in the chi-square-like form

(32)    $\tau_b = \dfrac{\sum_a \sum_b (\rho_{ab} - \rho_{a\cdot}\rho_{\cdot b})^2/\rho_{a\cdot}}{1 - \sum_b \rho_{\cdot b}^2}$.


It is clear that $\tau_b$ takes values between 0 and 1; it is 0 if and only
if there is independence, and 1 if and only if knowledge of $A_a$ completely
determines $B_b$. Finally $\tau_b$ is indeterminate if and only if both
independence and determinism simultaneously hold, that is if all
$\rho_{\cdot b}$'s but one are zero.
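The equivalence of the two forms (31) and (32), and the two boundary cases, can be checked with a short Python sketch. The function names and the three tables are assumptions made for the illustration.

```python
def _margins(rho):
    row_tot = [sum(row) for row in rho]
    col_tot = [sum(row[b] for row in rho) for b in range(len(rho[0]))]
    return row_tot, col_tot


def tau_b(rho):
    """Form (31): relative decrease in incorrect proportional predictions."""
    row_tot, col_tot = _margins(rho)
    case1 = sum(c * c for c in col_tot)
    case2 = sum(x * x / row_tot[a]
                for a, row in enumerate(rho) if row_tot[a] > 0
                for x in row)
    return (case2 - case1) / (1 - case1)


def tau_b_chi_form(rho):
    """Form (32): the chi-square-like expression."""
    row_tot, col_tot = _margins(rho)
    numerator = sum((x - row_tot[a] * col_tot[b]) ** 2 / row_tot[a]
                    for a, row in enumerate(rho) if row_tot[a] > 0
                    for b, x in enumerate(row))
    return numerator / (1 - sum(c * c for c in col_tot))


independent = [[0.12, 0.28], [0.18, 0.42]]    # outer product of margins
deterministic = [[0.5, 0.0], [0.0, 0.5]]      # A class fixes B class
hypothetical = [[0.30, 0.05, 0.05], [0.10, 0.20, 0.30]]

print(tau_b(independent), tau_b(deterministic))
print(tau_b(hypothetical), tau_b_chi_form(hypothetical))
```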


10. ASSOCIATION WITH A PARTICULAR CATEGORY 


A group of modifications of many of the preceding measures arises
from the observation that there may be little association between the
A and B polytomies in general, but if an individual is in a particular A
class it may be easy to predict his B class. Suppose, then, that we want
the association between $A_{a_0}$, a specific A class, and the B polytomy.
One need only condense all the $A_a$ rows where $a \ne a_0$ into a single
row, thus obtaining a 2×β table, and apply whatever measure of association
is thought appropriate. The table will have this appearance.


                       $B_1$                                  $B_2$                                  ...   $B_\beta$

$A_{a_0}$              $\rho_{a_0 1}$                         $\rho_{a_0 2}$                         ...   $\rho_{a_0 \beta}$

$A_a$ $(a \ne a_0)$    $\rho_{\cdot 1} - \rho_{a_0 1}$        $\rho_{\cdot 2} - \rho_{a_0 2}$        ...   $\rho_{\cdot \beta} - \rho_{a_0 \beta}$


We are indebted to L. L. Thurstone for pointing out to us the
importance of this modification.


11. PARTIAL ASSOCIATION 

When there are more than two polytomies it is natural to think of
partial association between two of them with the effect of the others
averaged out in some sense. Two such measures of partial association
will be suggested here for the asymmetrical case and three polytomies.
The viewpoint will be that of optimal prediction. Analogous symmetrical
measures may be readily obtained, and the restriction to three
polytomies is purely for convenience of notation. The first two
polytomies will be denoted as before; the third will consist of the
classification $C_1, C_2, \ldots, C_\gamma$. The proportion of the population
in $A_a$, $B_b$, and $C_c$ is $\rho_{abc}$, and dots will be used to denote
marginal values in the conventional way. The proposed measures will be for
partial association between the A and B polytomies 'averaged' over the C
polytomy. (Do not confuse the integer γ used here with the index γ of
Section 6.)



11.1. Simple Average of $\lambda_b$

For fixed $C_c$, we have a conditional A×B double polytomy with
relative frequencies $\rho_{abc}/\rho_{\cdot\cdot c}$. Hence we can compute
$\lambda_b$ for each such table; call it $\lambda_b(c)$ to show dependence on
c. Now it might seem natural to average these values with weights equal to
the marginal relative frequencies of the C classifications. That is, we
suggest


(33)    $\lambda_b(A, B \mid C) = \sum_{c=1}^{\gamma} \rho_{\cdot\cdot c}\, \lambda_b(c)$.


11.2. Measure Based Directly on Probabilities of Error 


It seems to us somewhat better, from the viewpoint of interpretation,
to proceed as follows. For given $C_c$, if we predict B classes optimally
on the basis of no further information, the probability of error is
$1 - (\mathrm{Max}_b\, \rho_{\cdot bc})/\rho_{\cdot\cdot c}$; whereas if we
know the A class the probability of error is
$1 - (\sum_a \mathrm{Max}_b\, \rho_{abc})/\rho_{\cdot\cdot c}$. Hence, if we
are given individuals from the population at random and always told their C
class, the probability of error in optimal guessing if we know nothing more
is $1 - \sum_c \mathrm{Max}_b\, \rho_{\cdot bc}$; whereas if we also know the
A class the probability is $1 - \sum_c \sum_a \mathrm{Max}_b\, \rho_{abc}$.
Thus the relative decrease in probability of error is

(34)    $\lambda_b'(A, B \mid C) = \dfrac{\sum_c \sum_a \mathrm{Max}_b\, \rho_{abc} - \sum_c \mathrm{Max}_b\, \rho_{\cdot bc}}{1 - \sum_c \mathrm{Max}_b\, \rho_{\cdot bc}}$,


which might often be a satisfactory measure of partial association.
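The two partial measures, the weighted average of Section 11.1 and the direct error-probability measure of Section 11.2, can be compared on a small example. The sketch below is an illustration only: the function names and the 2 x 2 x 2 table `rho[a][b][c]` are hypothetical, and the sketch assumes that no layer has an indeterminate λ_b.

```python
def lambda_b(layer):
    """Optimal-prediction measure for one A x B layer (need not sum to 1)."""
    total = sum(map(sum, layer))
    col_max = max(sum(row[b] for row in layer) for b in range(len(layer[0])))
    return (sum(max(row) for row in layer) - col_max) / (total - col_max)


def partial_measures(rho):
    """Return (weighted average of layer lambdas, direct error-based measure)."""
    n_a, n_b, n_c = len(rho), len(rho[0]), len(rho[0][0])
    average = informed = uninformed = 0.0
    for c in range(n_c):
        layer = [[rho[a][b][c] for b in range(n_b)] for a in range(n_a)]
        weight = sum(map(sum, layer))                 # proportion in C class c
        average += weight * lambda_b(layer)           # weighted average term
        informed += sum(max(row) for row in layer)    # A and C known
        uninformed += max(sum(row[b] for row in layer) for b in range(n_b))
    direct = (informed - uninformed) / (1 - uninformed)
    return average, direct


# Hypothetical proportions rho[a][b][c]; the entries sum to 1.
rho = [
    [[0.20, 0.30], [0.05, 0.10]],
    [[0.05, 0.05], [0.20, 0.05]],
]
print(partial_measures(rho))
```

In this example the second C layer contributes nothing to the weighted average (knowing A does not change the best guess there), while the direct measure pools the error probabilities over both layers, so the two values differ.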


12. MULTIPLE ASSOCIATION 


When there are more than two polytomies one may well be interested
in the multiple association between one of them and all the others.
One simple way of handling this in the unordered case will be described
here for three polytomies A, B, and C as defined in Section 11. We
suppose that the multiple association between A and B-together-with-C
is of interest. Simply form a two-way table whose rows represent the A
polytomy and whose columns represent all combinations $(B_b, C_c)$ and
then apply the appropriate two-polytomy measure. The table will have
this appearance:


            $B_1C_1$          $B_1C_2$          ...  $B_1C_\gamma$          $B_2C_1$          ...  $B_2C_\gamma$          ...  $B_\beta C_\gamma$
$A_1$       $\rho_{111}$      $\rho_{112}$      ...  $\rho_{11\gamma}$      $\rho_{121}$      ...  $\rho_{12\gamma}$      ...  $\rho_{1\beta\gamma}$
$A_2$       $\rho_{211}$      $\rho_{212}$      ...  $\rho_{21\gamma}$      $\rho_{221}$      ...  $\rho_{22\gamma}$      ...  $\rho_{2\beta\gamma}$
...
$A_\alpha$  $\rho_{\alpha 11}$  $\rho_{\alpha 12}$  ...  $\rho_{\alpha 1\gamma}$  $\rho_{\alpha 21}$  ...  $\rho_{\alpha 2\gamma}$  ...  $\rho_{\alpha\beta\gamma}$

Note that this procedure does not take the B×C association into
account. There is a rough analogy here with the motivation for the standard
multiple correlation coefficient of normal theory. The standard
multiple correlation coefficient may be (and often is) motivated by
defining it as the maximum correlation coefficient obtainable between a
given variate and linear combinations of the other variates. That is, it is
a measure of association between a given variate and the best estimate
(in a certain sense) of that variate based upon all the other variates.
It is true that the standard multiple correlation coefficient may be
expressed as a function of the several ordinary bivariate correlation
coefficients, but in a sense this is a consequence of the strong structural
assumption of multivariate normality.
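Forming the two-way table for multiple association is a matter of flattening the B and C indices into a single column index. In the sketch below the function names and the table `rho[a][b][c]` are hypothetical, and the measure applied to the flattened table (optimal prediction of the A class from the combined column) is only one possible choice of two-polytomy measure.

```python
def flatten_bc(rho):
    """Rows: A classes. Columns: (B_b, C_c) pairs, with c varying fastest."""
    return [[p for b_slice in a_slice for p in b_slice] for a_slice in rho]


def lambda_predicting_rows(table):
    """Relative decrease in error when guessing the row class from the column."""
    row_max = max(sum(row) for row in table)
    col_best = sum(max(row[j] for row in table) for j in range(len(table[0])))
    return (col_best - row_max) / (1 - row_max)


# Hypothetical proportions rho[a][b][c]; the entries sum to 1.
rho = [
    [[0.20, 0.30], [0.05, 0.10]],
    [[0.05, 0.05], [0.20, 0.05]],
]
flat = flatten_bc(rho)
print(flat)
print(lambda_predicting_rows(flat))
```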


13. SAMPLING PROBLEMS 


The discussion thus far has been in terms of known populations, 
whereas in practice one generally deals with a sample from an unknown 
population. One then asks, given a formal measure of association, how 
to estimate its value, how to test hypotheses about it, and so on. 

Exact sampling theory for estimators from cross-classification
tables is difficult to work with. However, the asymptotic theory is
reasonably manageable, at least in some cases. We intend to discuss
this in another paper, where we shall state some of the asymptotic
distributions and say what we can of their value as approximations.


14. CONCLUDING REMARKS


The aim of this paper has been to argue that measures of association 
should not be taken blindly from the handiest statistics textbook, but 
rather should be carefully constructed in a manner appropriate to the 
problem at hand. To emphasize and illustrate this argument we have 
described a number of such measures which we feel might be useful in 
several situations. While we naturally take a friendly view towards 
these measures, we can hardly claim that they are more than examples. 

This methodologically neutral position should not be carried to an
extreme. It would be ridiculous to ask each empirical scientist in each
separate study to forge afresh new statistical tools. The artist cannot
paint many pictures if he must spend most of his time mixing pigments.
Our belief is that each scientific area that has use for measures of
association should, after appropriate argument and trial,* settle down on
those measures most useful for its needs.

* For examples of such argument and trial in the field of sociology see
J. J. Williams [13], Jahn [5], and McCormick [8].


15. REFERENCES


[1] Cramér, Harald, Mathematical Methods of Statistics, Princeton, Princeton
University Press (1946).

[2] Fisher, R. A., Statistical Methods for Research Workers, Tenth Edition, New
York, Hafner Publishing Co. (1948).

[3] Greenwood, Major, Jr., and Yule, G. Udny, "The statistics of anti-typhoid
and anti-cholera inoculations, and the interpretation of such statistics in
general," Proceedings of the Royal Society of Medicine, 8 [part 2] (1915),
113-94.

[4] Guttman, Louis, "An outline of the statistical theory of prediction,"
Supplementary Study B-1 (pp. 253-318) in Horst, Paul and others (editors), The
Prediction of Personal Adjustment, Bulletin 48, Social Science Research
Council, New York (1941).

[5] Jahn, Julius A., "The measurement of ecological segregation: derivation of
an index based on the criterion of reproducibility," American Sociological
Review, 15 (1950), 101-4.

[6] Kendall, M. G., "Rank and product-moment correlation," Biometrika, 36
(1949), 177-93.

[7] Kendall, Maurice G., The Advanced Theory of Statistics, London, Charles
Griffin and Co., Ltd. (1948).

[8] McCormick, Thomas C., "Toward causal analysis in the prediction of
attributes," American Sociological Review, 17 (1952), 35-44.

[9] Pearson, Karl, and Heron, David, "On theories of association," Biometrika,
9 (1913), 159-315.

[10] Pompilj, G., "Osservazioni sull'omogamia: La trasformazione di Yule e il
limite della trasformazione ricorrente di Gini," Rendiconti di Matematica e
delle sue Applicazioni, Università di Roma, Istituto Nazionale di Alta
Matematica, Ser. V, Vol. 9 (1950), 367-88.

[11] Stuart, A., "The estimation and comparison of strengths of association in
contingency tables," Biometrika, 40 (1953), 105-10.

[12] Williams, E. J., "Use of scores for the analysis of information in
contingency tables," Biometrika, 39 (1952), 274-89.

[13] Williams, Josephine J., "Another commentary on so-called segregation
indices," American Sociological Review, 13 (1948), 298-303.

[14] Youden, W. J., "Index for rating diagnostic tests," Cancer, 3 (1950), 32-5.

[15] Yule, G. Udny, "On the methods of measuring association between two
attributes," Journal of the Royal Statistical Society, 75 (1912), 579-642.

[16] Yule, G. Udny, and Kendall, M. G., An Introduction to the Theory of
Statistics, London, Charles Griffin and Co., Ltd. (1950).

[17] Whelpton, P. K., and Kiser, Clyde V., Social and Psychological Factors
Affecting Fertility, Milbank Memorial Fund, New York.
Volume 1 (1946), The Household Survey in Indianapolis.
Volume 2 (1950), The Intensive Study; Purpose, Scope, Methods, and Partial
Results.
Volume 3 (1952), Further Reports on Hypotheses in the Indianapolis
Study.

[18] Jaffe, A. J., review of [17], Volume 2, Journal of the American Statistical
Association, 47 (1952), 348-9.

[19] Lewis-Faning, E., review of [17], Volume 3, Journal of the American
Statistical Association, 49 (1954), 190-3.



А TEST OF GOODNESS OF FIT 


T. W. ANDERSON AND D. A. DARLING*
Columbia University and University of Michigan


Some (large sample) significance points are tabulated for
a distribution-free test of goodness of fit which was introduced
earlier by the authors. The test, which uses the actual
observations without grouping, is sensitive to discrepancies at the
tails of the distribution rather than near the median. An
illustration is given, using a numerical example used previously
by Birnbaum in illustrating the Kolmogorov test.


1. THE PROCEDURE


The problem of statistical inference considered here is to test the
hypothesis that a sample has been drawn from a population with a
specified continuous cumulative distribution function F(x). For example,
the population may be specified by the hypothesis to be normal with 
mean 1 and variance 3; the corresponding cumulative distribution func- 
tion is 


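(1) $F(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{x} e^{-(y-1)^2/(2\sigma^2)}\,dy,$ with $\sigma^2$ the hypothesized variance.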


In practice the procedure really tests the hypothesis that the sample 
has been drawn from a population with a completely specified density 
function, since the cumulative distribution function is simply the integral 
of the density. 

The test procedure we have proposed earlier [1] is the following: Let
$x_1 \le x_2 \le \cdots \le x_n$ be the $n$ observations in the sample in order, and let
$u_j = F(x_j)$. Then compute


(2) $W_n^2 = -n - \frac{1}{n} \sum_{j=1}^{n} (2j - 1)\,[\log u_j + \log(1 - u_{n-j+1})]$

where the logarithms are the natural logarithms. If this number is too
large, the hypothesis is to be rejected.
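
Formula (2) can be sketched in a few lines of code. The function names below are illustrative assumptions, not part of the paper; the input is the list of values $u_j = F(x_j)$, each assumed to lie strictly between 0 and 1.

```python
import math

def anderson_darling_w2(u):
    """Evaluate formula (2) from the values u_j = F(x_j), 0 < u_j < 1."""
    u = sorted(u)                      # order statistics u_1 <= ... <= u_n
    n = len(u)
    total = 0.0
    for j in range(1, n + 1):          # j is 1-based, as in the text
        # u[j - 1] is u_j and u[n - j] is u_{n-j+1}
        total += (2 * j - 1) * (math.log(u[j - 1]) + math.log(1.0 - u[n - j]))
    return -n - total / n

def w2_from_sample(sample, cdf):
    """Hypothetical helper: apply the hypothesized F before evaluating (2)."""
    return anderson_darling_w2([cdf(x) for x in sample])
```

For a normal hypothesis, `cdf` could be taken as `statistics.NormalDist(mu, sigma).cdf`; the resulting value is then compared with the tabulated significance points.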

This procedure may be used if one wishes to reject the hypothesis 
whenever the true distribution differs materially from the hypothetical 
and especially when it differs in the tails. 

Significance points for $W_n^2$ are not available for small sample sizes. The
asymptotic significance points are given below:


* Work sponsored by Office of Scientific Research, U. S. Air Force, Contract AF 18(600)-442,
Project No. R-345-20-7.
The authors wish to acknowledge the assistance of Vernon Johns in the computations.




766 AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1954 
ASYMPTOTIC SIGNIFICANCE POINTS 


Significance Level Significance Point 
.10 1.933 
.05 2.492 
.01 3.857 


2. A NUMERICAL ILLUSTRATION 


Birnbaum [2] has considered a sample of 40 observations and applied
the Kolmogorov statistic to test the hypothesis that the population from
which the data came was normal with mean 1 and standard deviation
$1/\sqrt{6}$. By this test he found the data were consistent with the hypothesis.
We have analyzed the same data using (2), obtaining $W_n^2 = 1.158$ which
is well below the 10 per cent significance point, and we do not reject the
hypothesis.

The computation sheet for this calculation had the following columns:
$x_j$, $\sqrt{6}(x_j - 1)$, $u_j = F(x_j)$, $1 - u_{n-j+1}$, $\log u_j$, $\log(1 - u_{n-j+1})$, and
$-[\log u_j + \log(1 - u_{n-j+1})]$. The operation $u_j = F(x_j)$ is simply finding the
probabilities to the left of $\sqrt{6}(x_j - 1)$ according to the standard normal distribution.

Another test procedure uses the Cramér-von Mises $\omega^2$ criterion given by


(3) $n\omega^2 = \frac{1}{12n} + \sum_{j=1}^{n} \left( u_j - \frac{2j - 1}{2n} \right)^2.$


The asymptotic distribution of this statistic is given in [1]. For Birnbaum's
data we obtain $n\omega^2 = .1789$, which is also well below the 10 per
cent asymptotic significance point of .3473.

In these two examples we have used the asymptotic percentage points
instead of the actual ones based on finite sample size. Empirical study
suggests that the asymptotic value is reached very rapidly, and it appears
safe to use the asymptotic value for a sample size as large as 40.
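
The criterion (3) is equally short to evaluate. The sketch below is an illustration only; the function name is an assumption, and the input is again the list of values $u_j = F(x_j)$.

```python
def cramer_von_mises_nw2(u):
    """Evaluate n * omega^2 of formula (3) from the values u_j = F(x_j)."""
    u = sorted(u)                      # order statistics u_1 <= ... <= u_n
    n = len(u)
    return 1.0 / (12 * n) + sum(
        (u[j - 1] - (2 * j - 1) / (2 * n)) ** 2 for j in range(1, n + 1)
    )
```

When every $u_j$ falls exactly on $(2j-1)/(2n)$ the sum vanishes and the statistic takes its smallest possible value, $1/(12n)$.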

Application to the same data of the usual $\chi^2$ criterion of K. Pearson,
using 8 categories each with expected frequency 5, shows that $\chi^2 = 6.4$,
which with 7 degrees of freedom is not significant at the 10 per cent level.


3. DERIVATION OF THE CRITERION 


Several test procedures are based on comparing the specified cumulative
distribution function $F(x)$ with its sample analogue, the empirical
cumulative distribution function


$F_n(x) = \frac{(\text{no. of } x_i \le x)}{n}.$


The present writers suggested [1] the use of the criterion


(4) $W_n^2 = n \int_{-\infty}^{\infty} [F_n(x) - F(x)]^2\, \psi[F(x)]\, dF(x)$


where $\psi(u)$ is some nonnegative weight function chosen by the experimenter
to accentuate the values of $F_n(x) - F(x)$ where the test is desired
to have sensitivity. The hypothesis is to be rejected if $W_n^2$ is sufficiently
large. When $\psi(u) \equiv 1$ this criterion is $n$ times the $\omega^2$ criterion.

The criterion $W_n^2$ is an average of the squared discrepancy
$[F_n(x) - F(x)]^2$, weighted by $\psi[F(x)]$ and the increase in $F(x)$ (and the
normalization $n$). If one wishes the test to have good power against
alternatives in which $H(x)$, the true distribution, and $F(x)$ disagree near
the tails of $F(x)$, and to this end is willing to sacrifice power against
alternatives in which $H(x)$ and $F(x)$ disagree near the median of $F(x)$,
it seems that one ought to choose $\psi(u)$ to be large for $u$ near 0 and 1,
and small near $u = 1/2$. Even if the alternative hypotheses are closely
delineated, however, it appears difficult to find an “optimum” weight
function $\psi(u)$. For a discussion of the general nature of power of
distribution-free tests, see, for example, Birnbaum [3] and Lehmann [4].

For a given value of $x$, $F_n(x)$ is a binomial variable; it is distributed
in the same way as the proportion of successes in $n$ trials, where the
probability of success is $H(x)$. Thus, $E[F_n(x)] = H(x)$ and


(5) $nE[F_n(x) - F(x)]^2 = nE[F_n(x) - H(x)]^2 + n[F(x) - H(x)]^2$
    $= H(x)[1 - H(x)] + n[F(x) - H(x)]^2.$


Under the null hypothesis ($H(x) = F(x)$), the variance is $F(x)[1 - F(x)]$.
In a sense, we would equalize the sampling error over the entire range
of $x$ by weighting the deviation by the reciprocal of the standard deviation
under the null hypothesis, that is, by using


(6) $\psi(u) = \frac{1}{u(1 - u)}$

as a weight function. This function has the effect of weighting the tails
heavily since this function is large near $u = 0$ and $u = 1$. It is this weight
function (6) which we treat in the present note.

Formula (2) is obtained by writing (4) as



$W_n^2 = n \int_{-\infty}^{\infty} \frac{[F_n(x) - F(x)]^2}{F(x)[1 - F(x)]}\, dF(x)$

$\quad = n \int_{-\infty}^{x_1} \frac{F^2(x)}{F(x)[1 - F(x)]}\, dF(x) + n \int_{x_1}^{x_2} \frac{[1/n - F(x)]^2}{F(x)[1 - F(x)]}\, dF(x) + \cdots + n \int_{x_n}^{\infty} \frac{[1 - F(x)]^2}{F(x)[1 - F(x)]}\, dF(x)$


and letting $F(x) = u$ ($dF(x) = du$). Straightforward integration and collection
of terms gives (2). The formula (2.5) in [1] cannot be used directly
here, for that formula requires that $\int_0^1 \psi(u)\,du < \infty$, which is not true
of (6).
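
The reduction of (4), with the weight function (6), to formula (2) can be checked numerically. The sketch below is a supplementary illustration, not the authors' computation: after the substitution $F(x) = u$ it integrates each piece in closed form, using the partial-fraction identity $(c - u)^2/[u(1 - u)] = -1 + c^2/u + (1 - c)^2/(1 - u)$, and compares the total with (2). All function names are assumptions.

```python
import math

def w2_formula(u):
    """Formula (2), evaluated from the values u_j = F(x_j)."""
    u = sorted(u)
    n = len(u)
    s = sum((2 * j - 1) * (math.log(u[j - 1]) + math.log(1.0 - u[n - j]))
            for j in range(1, n + 1))
    return -n - s / n

def w2_integral(u):
    """n * integral over (0, 1) of (F_n - v)^2 / (v (1 - v)) dv, piece by piece."""
    u = sorted(u)
    n = len(u)
    knots = [0.0] + u + [1.0]
    total = 0.0
    for k in range(n + 1):
        c = k / n                      # value of F_n between the k-th and (k+1)-th knots
        a, b = knots[k], knots[k + 1]
        piece = -(b - a)               # integral of the constant term -1
        if c > 0:                      # c^2 / v term; a > 0 whenever c > 0
            piece += c * c * (math.log(b) - math.log(a))
        if c < 1:                      # (1 - c)^2 / (1 - v) term; b < 1 whenever c < 1
            piece += (1 - c) ** 2 * (math.log(1 - a) - math.log(1 - b))
        total += piece
    return n * total
```

For any set of values strictly inside (0, 1) the two functions should agree up to rounding error.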


4. COMPUTATION OF THE ASYMPTOTIC SIGNIFICANCE POINTS 


It was proved in [1] that the limiting characteristic function of $W_n^2$
defined in either (2) or (4) is


(7) $\phi(t) = \lim_{n \to \infty} E\left(e^{itW_n^2}\right) = \sqrt{\frac{-2\pi i t}{\cos\left(\frac{\pi}{2}\sqrt{1 + 8it}\right)}}$


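and that the inversion of this characteristic function gave for the limiting
cumulative distribution of $W_n^2$ the expression


(8) $\lim_{n \to \infty} \Pr\{W_n^2 \le z\} = \frac{\sqrt{2\pi}}{z} \sum_{j=0}^{\infty} \frac{(-1)^j\, \Gamma(j + \frac{1}{2})\,(4j + 1)}{\Gamma(\frac{1}{2})\, j!}\; e^{-(4j+1)^2 \pi^2/(8z)} \int_0^{\infty} \exp\left[ \frac{z}{8(w^2 + 1)} - \frac{(4j + 1)^2 \pi^2 w^2}{8z} \right] dw.$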


The terms of this series alternate in sign and the $(j+1)$st term is less
than the $j$th term, $j \ge 1$; thus the error involved in using only $j$ terms of
this series is less than the $(j+1)$st term for $j \ge 0$. By using the fact that
$e^{z/[8(w^2+1)]} \le e^{z/8}$, one can easily verify that to compute the probabilities
correctly to four decimal places, one needs only the 0-th term
for the first two significance points and the 0-th and 1-st terms for the
third significance point. The laborious part of the computation is the
evaluation of the integral. Let $[(4j+1)\pi/(2\sqrt{2z})]\,w = y$; then the integrand
is $f(y)e^{-y^2}$. The $y$-axis was divided into intervals according to the
integral of $e^{-y^2}$ and numerical integration was performed.




The moments of the asymptotic distribution are fairly easy to obtain 
from formulas given in [1]. The first two are 


$\lim_{n \to \infty} E(W_n^2) = \sum_{j=1}^{\infty} \frac{1}{j(j + 1)} = 1,$

$\lim_{n \to \infty} \operatorname{Var}(W_n^2) = 2 \sum_{j=1}^{\infty} \frac{1}{j^2 (j + 1)^2} = \frac{2}{3}(\pi^2 - 9) \approx .57974.$
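
As a quick arithmetic check of the two limiting moments, the series $\sum 1/[j(j+1)]$ and $2\sum 1/[j^2(j+1)^2]$ can be summed numerically and compared with 1 and with $\frac{2}{3}(\pi^2 - 9)$. This is only an illustrative sketch; the function names and the truncation point are assumptions.

```python
import math

def limiting_mean_partial(terms):
    """Partial sum of 1 / (j (j + 1)); it telescopes to 1 - 1 / (terms + 1)."""
    return sum(1.0 / (j * (j + 1)) for j in range(1, terms + 1))

def limiting_variance_partial(terms):
    """Partial sum of 2 / (j (j + 1))^2."""
    return 2.0 * sum(1.0 / (j * (j + 1)) ** 2 for j in range(1, terms + 1))

closed_form_variance = 2.0 * (math.pi ** 2 - 9.0) / 3.0
```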


The asymptotic significance points are computed to assure the prob- 
abilities (significance levels) to be correct to four decimal places. 


REFERENCES 


[1] Anderson, T. W., and Darling, D. A., “Asymptotic theory of certain ‘good- 
ness of fit’ criteria based on stochastic processes,” Annals of Mathematical 
Statistics, 23 (1952), 193-212. 

[2] Birnbaum, Z. W., “Numerical tabulation of the distribution of Kolmogorov’s
statistic for finite sample size,” Journal of the American Statistical Association,
47 (1952), 425-41.

[3] Birnbaum, Z. W., “Distribution-free tests of fit for continuous distribution
functions,” Annals of Mathematical Statistics, 24 (1953), 1-8.

[4] Lehmann, E. L., “The power of rank tests,” Annals of Mathematical Statistics,
24 (1953), 23-43.


UNIVARIATE TWO-POPULATION DISTRIBUTION-FREE 
DISCRIMINATION 


DAVID S. STOLLER
University of California at Los Angeles* 


A distribution-free procedure for classifying a univariate
random variable, $z$, into one of two populations on the basis
of a sample of size $N$, in which $m$ members are classified into
one population and the remaining $(N - m)$ into the other, is
given as follows: Let $t(z) = k(z) - h(z)$, where $k(z)$ is the number
of observations from the first population which are less
than $z$ and $h(z)$ is similarly defined for the second population.
If $z \le t^*$, where $t^*$ is that value of $z$ for which $t(z)$ is a maximum,
classify $z$ into the first population, otherwise into the
second. The probability of correct classification, and its estimate,
$[N - m + t(t^*)]/N$, both converge in probability to the
maximum attainable probability of correct classification.


1. INTRODUCTION


The following example typifies a discrimination problem which is of
interest. A group of $N$ students take an aptitude test for a certain
course, and receive scores $z_1, \cdots, z_N$. At the end of the course, all students
are classified into two groups, “superior” and “inferior,” on the
basis of final grades or any other criteria. Another student takes the
aptitude test and gets a score $z_{N+1}$. It is desired to classify this student
as “superior” or “inferior.” This will be done on the basis of selecting a
discriminating score, $t^*$, based on the previous $N$ scores, and classifying
the student as “superior” if $z_{N+1} > t^*$ and “inferior” if $z_{N+1} \le t^*$.

When there is complete a priori knowledge about the distribution
functions and the relative frequency of occurrence of the two groups,
Hoel and Peterson [6] have shown how to find an optimum discriminating
point, $t$, by maximizing the probability of correctly classifying $z_{N+1}$. If
the relative frequency of the two groups is not known, but there is otherwise
complete knowledge, Anderson [1] and Welch [9] have shown how
to find an optimum discriminating point by a minimax procedure.

If, further, there exists only partial a priori knowledge about the probability
distributions, specifically, if the functional forms of the distributions
are known but the parameters and relative frequencies are unknown,
then, under certain restrictions, estimating the optimum discriminating
point by replacing unknown parameters with their maximum likelihood
estimates is an asymptotically optimum procedure [6].


* Now at The Rand Corporation.




A distribution-free procedure for the case where there exists no a priori
knowledge of the parameters or form of the distribution functions has
been investigated by Fix and Hodges [4], whereby $z_{N+1}$ is classified into
one group or another according to whether the sample values, $z_i$, “closest”
to $z_{N+1}$ are mostly in one or another group. In [4], consistency properties
are proved about the probability of misclassification induced by this
procedure. The small sample behavior for some special cases is considered
in a later paper [5].

The present paper proposes a distribution-free procedure for the uni- 
variate two-population case, together with an estimate of the probability 
of correct classification. It is shown that (1) the estimate of the probability 
of correct classification is a consistent estimate of the optimum prob- 
ability of correct classification, (2) the probability of correct classification 
induced by this procedure converges in probability to the optimum prob- 
ability of correct classification. 


2. STATEMENT OF PROBLEM 


Let $\Pi$ be a composite univariate population, in which $\Pi_1$ and $\Pi_2$ are
sub-populations with cumulative distribution functions $F_1(z)$ and $F_2(z)$.
Let $\theta$ be the probability that $z$, a random member of $\Pi$, is a member of
$\Pi_1$, i.e. $z$ is a random variable defined by the cumulative distribution
function, $\theta F_1(z) + (1 - \theta)F_2(z)$.

A random sample, $z_1, \cdots, z_N$, is taken from $\Pi$, in which $m$ of the $z_i$
are identifiable as members of $\Pi_1$ and the remaining $N - m$ as members
of $\Pi_2$. Another sample value, $z_{N+1}$, which is unidentifiable, is obtained at
random. Without a priori knowledge of $\theta$ or of the functional forms or
parameters of $F_1(z)$ and $F_2(z)$, it is desired to


(1) Classify $z_{N+1}$ as a member of $\Pi_1$ or $\Pi_2$.
(2) Estimate the probability that $z_{N+1}$ has been correctly classified.


The functions $F_1(z)$ and $F_2(z)$, however, will be restricted to be such
that (1) $F_1(z)$, $F_2(z)$ are absolutely continuous (so that the probability of
tied sample values is zero), and (2) the optimum discrimination scheme
consists of classifying $z_{N+1}$ according to whether $z_{N+1} \le t$ or $z_{N+1} > t$,
where $t$ is a unique point. An optimum discrimination scheme is here
defined as one that maximizes the probability of correct classification.

The probability of correct classification when an arbitrary point, $z$, is
used to classify $z_{N+1}$ by the above rule is given by:


$Q(z) = \theta F_1(z) + (1 - \theta)[1 - F_2(z)].$

By definition, $Q(z)$ achieves its maximum at $z = t$.


3. A DISTRIBUTION-FREE DISCRIMINATION PROCEDURE 


An estimate of the discriminating point, $t$, will be made by first considering
a distribution-free estimate of the function, $Q(z)$. This estimate
is formed by replacing $\theta$ by $m/N$, and $F_1(z)$ and $F_2(z)$ by the appropriate
step functions, as follows.

Let $x_1, \cdots, x_m$ be the sample members from $\Pi_1$, ordered by magnitude,
and $y_1, \cdots, y_{N-m}$ the similarly ordered members from $\Pi_2$. Define the
step functions:

$S_m(z) = k/m, \quad x_k \le z < x_{k+1}; \quad k = 0, \cdots, m,$
$S_{N-m}(z) = h/(N - m), \quad y_h \le z < y_{h+1}; \quad h = 0, \cdots, N - m,$

where

$x_0 = y_0 = -\infty; \quad x_{m+1} = y_{N-m+1} = +\infty.$


Then a distribution-free estimate of $Q(z)$ is defined by

$\hat{Q}(z) = (m/N)\,S_m(z) + [(N - m)/N]\,[1 - S_{N-m}(z)]$
$\quad = \frac{1}{N}\,[\,N - m + (k - h)\,].$


(Note that $0 \le \hat{Q}(z) \le 1$.) Take any value of $z$ that maximizes $\hat{Q}(z)$, say
$z = t^*$. Then $t^*$ is an estimate of the optimum classification point, $t$, and
$\hat{Q}(t^*)$ is an estimate of the probability of correct classification of the
scheme: Classify $z_{N+1}$ as a member of $\Pi_1$ if $z_{N+1} \le t^*$, otherwise as a member
of $\Pi_2$.

4. ILLUSTRATION


To illustrate the procedure, consider the following example. A class of
25 beginning algebra students took a test on elementary algebraic operations
after two weeks of instruction. At the end of the course, the instructor
classified the 25 students into two groups: “inferior” and “superior.”
Ranking the students by means of their two-week test scores, the following
ordering resulted, where the scores of the 8 “superior” students are
indicated by italics:

50, 51, 57½, 63, 64, 68, 72, 73, 74, 74½, 75, 75½, 76,
76½, 77, 78, 81, 82, 84, 85, 86, 87½, 89, 91, 92.

To this corresponds the following ordering of the $x$'s and $y$'s:

$x_1, x_2, x_3, x_4, x_5, x_6, x_7, x_8, x_9, y_1, y_2, x_{10}, y_3,$
$x_{11}, x_{12}, x_{13}, x_{14}, x_{15}, y_4, y_5, y_6, x_{16}, y_7, y_8, x_{17}.$




Notice that $(k - h)$ may be computed very rapidly by the following
procedure. At $x_0$, $(k - h) = 0$. Proceeding along the ordered $x$'s and $y$'s,
add one whenever an $x$ is encountered, subtract one whenever a $y$ is encountered.
In the example, it can be seen that $(k - h)$ attains a maximum
(of 12) in the interval from $x_{15}$ to $y_4$, where $82 \le z < 84$. Therefore, any
point in this interval, say $t^* = 83$, yields a discriminating procedure. An
estimate of the probability of correct classification induced by $t^* = 83$ is
$\hat{Q}(83) = 20/25$, where the numerator is equal to $(N - m)$ plus the maximum
value of $(k - h)$ and the denominator is $N$.
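
The running count described above can be sketched as follows. The function name is an assumption, ties are assumed absent (as under the continuity restriction of Section 2), and the split of the 25 scores into the two groups is a reconstruction from the ordering of the $x$'s and $y$'s, since the italics marking the “superior” scores are not reproduced here.

```python
from fractions import Fraction

def discriminating_point(first, second):
    """Return (max of k - h, maximizing intervals [lo, hi), estimate Q-hat).

    first  -- observations identified with the first population (the x's)
    second -- observations identified with the second population (the y's)
    """
    m, N = len(first), len(first) + len(second)
    pooled = sorted([(z, +1) for z in first] + [(z, -1) for z in second])
    running, best = 0, 0                   # k - h = 0 to the left of every observation
    intervals = [(float("-inf"), pooled[0][0])]
    for i, (z, step) in enumerate(pooled):
        running += step                    # +1 at an x, -1 at a y
        upper = pooled[i + 1][0] if i + 1 < N else float("inf")
        if running > best:
            best, intervals = running, [(z, upper)]
        elif running == best:
            intervals.append((z, upper))
    return best, intervals, Fraction(N - m + best, N)

# Scores from the illustration; the group assignment is a reconstruction.
inferior = [50, 51, 57.5, 63, 64, 68, 72, 73, 74, 75.5, 76.5, 77, 78, 81, 82, 87.5, 92]
superior = [74.5, 75, 76, 84, 85, 86, 89, 91]
best, intervals, q_hat = discriminating_point(inferior, superior)
```

With these inputs the scan reproduces the figures quoted in the text: a maximum of 12 on the single interval from 82 to 84, and an estimate of 20/25.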

When applying the procedure described above, it may happen that
two or more intervals exist in which $(k - h)$ is maximized. It is subsequently
proved that any point which maximizes $(k - h)$ possesses asymptotically
optimum characteristics. From a large sample point of view,
therefore, when more than one maximizing interval is encountered, any
point from any one of the maximizing intervals may be selected as $t^*$.
In small samples, as a practical consideration, one should select an
“average” value of $t^*$, say the average value of the midpoints of
the maximizing intervals. The “average” value of $t^*$, as compared to
any one value of $t^*$, will possess a more stable sampling behavior for small
sample sizes, and will have the same asymptotically optimum characteristics
discussed subsequently.

It may also happen that tied sample values will occur, due either to
rounding off of observations or to a priori discreteness of the populations.
In that category of ties which does not affect the calculation of $t^*$, the
tied sample values may be ranked arbitrarily. When the calculation of
$t^*$ is affected by the occurrence of one or more sets of ties due to rounding
off of observations, each critical set of ties in each population may be
distributed uniformly over the round-off range of the observations, and
then ordered accordingly. For example, in the sample:


1, 1, 1, 2, 3, 4, 4, 4, 4, 4, 5, 6, 7,


the tied sample values (1, 1, 1) may be ranked arbitrarily since, by
inspection, this does not affect the value of $t^*$. The set (4, 4, 4, 4, 4),
which does affect the calculation of $t^*$, contains two observations from
$\Pi_1$, which when distributed uniformly over the round-off range, 3½ to
4½, are assigned the values (3¾, 4¼). Likewise, the set (4, 4, 4) is
assigned the values (3⅔, 4, 4⅓).

When the populations are a priori discrete, and critical sets of tied
values occur, $k(z)$ may be redefined as the number of observations from
$\Pi_1$ less than or equal to $z$, and $h(z)$ as the number of observations from
$\Pi_2$ less than or equal to $z$. Then $k(z) - h(z)$ will be uniquely defined even
when ties occur. In the example above, treating the sample values as
a priori discrete, $k(1) - h(1) = 1$, $k(4) - h(4) = 2$, and $k(z) - h(z)$ is maximized
in the interval, $3 \le z < 4$.

In Section 1, the criterion for determining an optimum classification 
scheme is based on maximizing the probability of correctly classifying 
one observation whose population membership is unknown. It should 
be noted that a classification scheme which is optimum in this sense 
for one unidentified observation is also optimum for a group of r inde- 
pendent, unidentified observations. For, if p represents the probability 
that one observation is correctly classified, then $p^r$ is the probability
that the entire group is correctly classified, and the latter probability 
is maximized when p is maximized. 


5. DISTRIBUTION OF $\hat{Q}(z)$


For fixed $m$, $k$ and $h$ are independent binomial variates of means
$mF_1(z)$ and $(N - m)F_2(z)$ respectively. Also, $m$ is a binomial variate of
mean $N\theta$, therefore it can be shown (by use of conditional expectation)
that, for each $z$,


$E[\hat{Q}(z)] = Q(z).$


Thus, for each $z$, $\hat{Q}(z)$ is an unbiased estimate of $Q(z)$. It can also be
shown that


$\operatorname{Var}[\hat{Q}(z)] = c/N,$


where $c$ is a constant depending on $\theta$, $F_1(z)$, $F_2(z)$, but not on $N$.
Since


$\lim_{N \to \infty} c/N = 0,$


by a Tchebychev inequality (see Cramér [2], Theorem 20.4) it is seen
that, for each $z$, $\hat{Q}(z)$ is also a consistent estimate of $Q(z)$.
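
The constant $c$ is not written out in the text. As a supplementary sketch (not taken from the source), note that $N\hat{Q}(z)$ counts the sample members that the rule “$\Pi_1$ if the value is at most $z$” would classify correctly, and that each of the $N$ members is so classified independently with probability $Q(z)$. Under that reading:

```latex
\begin{align*}
N\hat{Q}(z) &= k + (N - m - h), \\
E[\hat{Q}(z) \mid m] &= \frac{1}{N}\bigl\{ m F_1(z) + (N - m)[1 - F_2(z)] \bigr\}, \\
E[\hat{Q}(z)] &= \theta F_1(z) + (1 - \theta)[1 - F_2(z)] = Q(z), \\
\operatorname{Var}[\hat{Q}(z)] &= \frac{Q(z)[1 - Q(z)]}{N},
  \qquad \text{so that } c = Q(z)[1 - Q(z)].
\end{align*}
```

The same value of $c$ results from adding the expected conditional variance, $\{\theta F_1(1 - F_1) + (1 - \theta)F_2(1 - F_2)\}/N$, to the variance of the conditional mean, $\theta(1 - \theta)(F_1 + F_2 - 1)^2/N$.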

6. SOME PROPERTIES OF THE CLASSIFICATION PROBABILITY
ESTIMATE, $\hat{Q}(t^*)$


It is readily seen that $\hat{Q}(t^*)$, the estimate of the classification probability
induced by the point estimate, $t^*$, is non-negatively biased,
since $\hat{Q}(t^*) \ge \hat{Q}(t)$, and thus


$E[\hat{Q}(t^*) - \hat{Q}(t)] \ge 0,$


from which,


$E[\hat{Q}(t^*)] \ge Q(t).$




An example (suggested by a referee) of the magnitude of the bias
for small samples is given by the special case: $\theta = 1/2$; $F_1(z) = F_2(z)$. Here
$Q(t) = 1/2$, but, by definition, $\hat{Q}(t^*) \ge 1/2$. Further, if $m = N/2$,


$\hat{Q}(t^*) = \frac{1}{2} + \frac{1}{2} \max_z\, (k/m - h/m)$


and


$\Pr\{\hat{Q}(t^*) - Q(t) > \delta\} = \Pr\{|\hat{Q}(t^*) - Q(t)| > \delta\} = \Pr\{\max_z |k/m - h/m| > 2\delta\},$


the latter probability having been tabulated by Massey [8]. For two
equal samples of size 10 from the same population, $\Pr\{\hat{Q}(t^*) > .7\}$
equals about .16, and $\Pr\{\hat{Q}(t^*) > .75\}$ equals about .05.

However, $\hat{Q}(t^*)$ is a consistent estimate of $Q(t)$, for given $\epsilon, \eta > 0$,
by sec. 5, for sufficiently large $N$,


$\Pr\{\hat{Q}(t) < Q(t) - \epsilon\} < \eta$

and since $\hat{Q}(t^*) \ge \hat{Q}(t)$,

$\Pr\{\hat{Q}(t^*) < Q(t) - \epsilon\} < \eta.$


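Now,

$\hat{Q}(t^*) - Q(t) = \max_z \left\{ \frac{N - m}{N} + \frac{k}{N} - \frac{h}{N} \right\} - \max_z \left\{ (1 - \theta) + \theta F_1(z) - (1 - \theta) F_2(z) \right\}$

$\quad \le \left| \theta - \frac{m}{N} \right| + \max_z \left| \frac{k}{N} - \theta F_1(z) \right| + \max_z \left| \frac{h}{N} - (1 - \theta) F_2(z) \right|.$

Since $m/N$ is a consistent estimate of $\theta$, for $N \ge N'(\epsilon, \eta)$, the inequality,


$\Pr\left\{ \left| \frac{m}{N} - \theta \right| > \epsilon \right\} < \eta$


is satisfied. Now,


$\max_z \left| \frac{k}{N} - \theta F_1(z) \right| \le \max_z \left| \frac{k}{N} - \frac{m}{N} F_1(z) \right| + \max_z \left| \frac{m}{N} F_1(z) - \theta F_1(z) \right|$

$\quad \le \frac{m}{N} \max_z \left| \frac{k}{m} - F_1(z) \right| + \left| \frac{m}{N} - \theta \right|.$


Consider the expression,

$\max_z\, | k/m - F_1(z) |,$

with $m$ temporarily held fixed. Now

$\sqrt{m}\, \max_z\, | k/m - F_1(z) |$

is well known (Kolmogorov [7], Feller [3]) to possess an asymptotic
probability distribution function with finite mean and variance. Therefore,
for any fixed $m > M(\epsilon, \eta/2)$,


$\Pr\left\{ \max_z \left| \frac{k}{m} - F_1(z) \right| > \epsilon \right\} < \eta/2.$

Now for $N > N''(M)$, $\Pr\{m < M\} < \eta/2$, therefore for $N > N''(M)$,


$\Pr\left\{ \max_z \left| \frac{k}{m} - F_1(z) \right| > \epsilon \right\} < \eta.$

A similar $N'''(\epsilon, \eta)$ exists for

$\max_z \left| \frac{h}{N - m} - F_2(z) \right|;$

thus for $N > \max(N', N'', N''')$,

$\Pr\{\hat{Q}(t^*) > Q(t) + 5\epsilon\} < \eta.$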

7. AN ASYMPTOTICALLY OPTIMUM PROPERTY OF THE DISCRIMINATING
POINT ESTIMATE, $t^*$

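It can also be shown that $Q(t^*)$, the actual (but unknown) classification
probability resulting from the use of the discriminating point
estimate, $t^*$, converges in probability to $Q(t)$, the optimum classification
probability, i.e. for sufficiently large sample sizes, $Q(t^*)$ is arbitrarily
close to $Q(t)$ with probability arbitrarily close to one. For,


$|Q(t^*) - Q(t)| \le |Q(t^*) - \hat{Q}(t^*)| + |\hat{Q}(t^*) - Q(t)|.$


By sec. 6, for $N > N(\epsilon, \eta)$,

$\Pr\{|\hat{Q}(t^*) - Q(t)| > \epsilon\} < \eta.$

Now,


$|Q(t^*) - \hat{Q}(t^*)| \le \left| \frac{m}{N} - \theta \right| + \left| \theta F_1(t^*) - \frac{k}{N} \right| + \left| (1 - \theta) F_2(t^*) - \frac{h}{N} \right|.$

Examining the second term,

$\left| \theta F_1(t^*) - \frac{k}{N} \right| \le \left| \theta F_1(t^*) - \theta \frac{k}{m} \right| + \left| \theta \frac{k}{m} - \frac{k}{N} \right|$

$\quad \le \theta \max_z \left| F_1(z) - \frac{k}{m} \right| + k \left| \frac{\theta}{m} - \frac{1}{N} \right|$

$\quad \le \max_z \left| F_1(z) - \frac{k}{m} \right| + \left| \theta - \frac{m}{N} \right|.$


Consequently, in a manner similar to the discussion of sec. 6, it can be
shown that there exists an $N^{\dagger}(\epsilon, \eta)$ such that for $N > N^{\dagger}(\epsilon, \eta)$,


$\Pr\{|Q(t^*) - Q(t)| > \epsilon\} < \eta.$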


REFERENCES 


[1] Anderson, T. W., “Classification by multivariate analysis,” Psychometrika, 16
(1951), 31-49.

[2] Cramér, H., Mathematical Methods of Statistics, Princeton, N. J., Princeton
University Press (1946), p. 253.

[3] Feller, W., “On the Kolmogorov-Smirnov limit theorems for empirical distributions,”
Annals of Mathematical Statistics, 19 (1948), 177-89.

[4] Fix, Evelyn, and Hodges, J. L., Discriminatory Analysis; Non-parametric
Discrimination; Consistency Properties, USAF School of Aviation Medicine,
Randolph Field, Texas, Report Number 4, February, 1951.

[5] Fix, Evelyn, and Hodges, J. L., Discriminatory Analysis; Non-parametric Discrimination;
Small Sample Performance, USAF School of Aviation Medicine,
Randolph Field, Texas, Report Number 11, August, 1951.

[6] Hoel, P. G., and Peterson, R. P., “A solution to the problem of optimum classification,”
Annals of Mathematical Statistics, 20 (1949), 433-8.

[7] Kolmogorov, A. N., “Sulla determinazione empirica delle leggi di probabilità,”
Giornale dell’ Istituto Italiano degli Attuari, 4 (1933), 1-11.

[8] Massey, Frank J., “The distribution of the maximum deviation between two
sample cumulative step functions,” Annals of Mathematical Statistics, 22
(1951), 125-8.

[9] Welch, B. L., “Note on discriminant functions,” Biometrika, 31 (1939), 218-20.


USE OF NORMAL PROBABILITY PAPER 


HERMAN CHERNOFF AND GERALD J. LIEBERMAN
Stanford University* 


Normal probability paper is so designed that the cumulative
distribution function of a normally distributed chance variable
appears as a straight line. It is a common practice to plot the
observations of a sample on this paper to obtain a graphical
check for normality or to obtain a graphical estimate of the
mean and variance of the population. Textbooks, however,
are not very specific about methods for plotting, for, although
the ordered observations are plotted along the abscissa, some
uncertainties about the corresponding ordinates are left unresolved.
The purpose of this paper is to indicate, with a special
example, that any graphical technique should depend to a
large extent on the purpose for which the graph is drawn. In
particular, it presents tables covering sample sizes up to 10,
for selecting the ordinates on normal probability paper so as
to obtain “optimum” graphical estimates of the mean $\xi$ and
the standard deviation $\sigma$ of a normal distribution. The somewhat
more complicated problem of selecting the ordinates to
obtain an “optimum” test for normality is not discussed.


1. UNBIASED ESTIMATES OF $\xi$ AND $\sigma$


By means of a non-linear transformation of the vertical scale on the
graph of the cumulative-normal-distribution curve, it is possible
to transform this curve to a straight line. Graph paper possessing this
property is known as normal probability paper. The abscissa scale
corresponds to the values of a normally distributed chance variable,
whereas the ordinate scale represents a number, $p$, between 0 and 1.
Neither 0 nor 1 appears on the ordinate scale.

If a sample of $n$ independent observations is to be plotted on normal
probability paper, it is natural to arrange them in ascending order,
i.e., $u_1 \le u_2 \le \cdots \le u_n$, and to plot a point corresponding to each
observation. One such plot is $(u_1, 1/n), (u_2, 2/n), \cdots, (u_n, n/n)$.
However, it is evident that the last point does not appear on the graph.
Furthermore, the symmetry of the normal distribution suggests that
$u_1$ and $u_n$ be treated in a “symmetric” fashion. Two alternative plots are


$\left(u_1, \frac{1}{n+1}\right), \left(u_2, \frac{2}{n+1}\right), \cdots, \left(u_n, \frac{n}{n+1}\right)$

and

$\left(u_1, \frac{1/2}{n}\right), \left(u_2, \frac{3/2}{n}\right), \cdots, \left(u_n, \frac{n - 1/2}{n}\right).$


* Work sponsored by the Office of Naval Research.



Since there is no obvious rationale for preferring one of these plots to
the other, there arises the problem of selecting an “optimum” method
of plotting.

Let us consider the problem of graphically estimating the mean $\xi$ and
standard deviation $\sigma$ of a normal population on the basis of a sample.
Once the points $(u_1, p_1), (u_2, p_2), \cdots, (u_n, p_n)$ are plotted, a method
which suggests itself is to fit a straight line visually to the points and to
take the abscissa where the line intersects $p = .5$ as an estimate of the mean,
and the distance between the abscissas where the line intersects $p = .8413$
and $p = .5$ as an estimate of the standard deviation. Let us assume that
the visually fitted line is a very good approximation to the line that would be
obtained by minimizing the sum of squares of the horizontal deviations from
the line.¹ The problem then is to find what values of $p_1, p_2, \cdots, p_n$
yield good estimates, $\hat{\xi}$ and $\hat{\sigma}$, of the mean $\xi$ and standard deviation $\sigma$ of
the normal population sampled. Since $p$ is not represented on a linear
scale, we shall transform to $v = v(p)$ which is related to $p$ by


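
(1) $p = \int_{-\infty}^{v} \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}\, dx.$

In terms of the ordinate $v$, the fitted straight line may be represented by

(2) $u = \hat{\xi} + \hat{\sigma} v$

where $\hat{\xi}$ and $\hat{\sigma}$ are the estimates of $\xi$ and $\sigma$. If $v_i = v(p_i)$, $i = 1, 2, \cdots, n$,
these estimates are

(3) $\hat{\xi} = \bar{u} - \hat{\sigma}\,\bar{v}$

and

(4) $\hat{\sigma} = \sum_{i=1}^{n} (u_i - \bar{u})(v_i - \bar{v}) \Big/ \sum_{i=1}^{n} (v_i - \bar{v})^2,$

where $\bar{u}$ and $\bar{v}$ denote the means of the $u_i$ and of the $v_i$.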
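
The estimates (3) and (4) amount to a least-squares fit of the ordered observations on the transformed ordinates. A minimal sketch follows, assuming the ordinates $p_i$ are supplied by the user and using the standard-library inverse normal distribution function for $v(p)$; the function name is an assumption.

```python
from statistics import NormalDist, mean

def graphical_estimates(sample, ordinates):
    """Return (xi_hat, sigma_hat) from equations (3) and (4).

    sample    -- the observations (sorted internally into u_1 <= ... <= u_n)
    ordinates -- the plotting ordinates p_1, ..., p_n, each in (0, 1)
    """
    u = sorted(sample)
    v = [NormalDist().inv_cdf(p) for p in ordinates]   # v_i = v(p_i), equation (1)
    u_bar, v_bar = mean(u), mean(v)
    sigma_hat = (sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v))
                 / sum((vi - v_bar) ** 2 for vi in v))
    xi_hat = u_bar - sigma_hat * v_bar
    return xi_hat, sigma_hat
```

If the plotted points lie exactly on a line $u = a + bv$, the function returns $a$ and $b$.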


¹ The use of horizontal deviations is suggested by the fact that the $p_i$ are not chance variables.



If we require that $\hat{\xi}$ be unbiased and $\hat{\sigma} \ne 0$, we must have $\bar{v} = 0$. In that case
$\hat{\xi} = \bar{u}$, which is the mean of the sample. In fact, it is the optimum estimate
of $\xi$ from many points of view.

The estimate of $\sigma$ may be represented by a linear function of the ordered
observations $u_1, u_2, \cdots, u_n$, i.e.,


(5) $\hat{\sigma} = \sum_{i=1}^{n} c_i u_i$

where

(6) $c_i = \frac{v_i - \bar{v}}{\sum_{j=1}^{n} (v_j - \bar{v})^2}, \quad i = 1, 2, \cdots, n.$


suitably varying the $v_i$. The reason for this is that the only condition
imposed by equation (6) on the $c_i$ is $\sum_{i=1}^{n} c_i = 0$. This condition is implied
by unbiasedness since


(7) $E\left( \sum_{i=1}^{n} c_i u_i \right) = k\sigma + \xi \sum_{i=1}^{n} c_i$

where $k$ depends on $c_1, c_2, \cdots, c_n$. It follows that the problem of finding
the optimum $v_i$ is equivalent to that of finding the minimum variance
unbiased estimate of $\sigma$ which is linear in the ordered observations.
This problem is one which was treated by Godwin [1]. He presents a
table of coefficients to be used with the ordered observations to obtain the
minimum variance unbiased estimate of $\sigma$. His results were transformed
for use in this paper. The “optimum” values of $p_i$ can be found in Table I.


A peculiarity of the problem of estimating $\sigma$ is that by introducing a
slight bias we are able to obtain a better estimate in the sense of minimizing
the mean square deviation from $\sigma$.

Suppose $t$ is an estimate of $\theta$ where $E(t) = \theta$. Consider the statistic
$at$ for which $a$ is defined such that $E[(at - \theta)^2]$ is a minimum. In other
words, we minimize the expression


$E\{[(at - a\theta) + (a - 1)\theta]^2\} = a^2\sigma_t^2 + (a - 1)^2\theta^2$

with respect to $a$, where $\sigma_t^2$ is the variance of $t$. This minimum occurs at

$a = \frac{\theta^2}{\theta^2 + \sigma_t^2}.$


It then follows that


$E[(at - \theta)^2] = \frac{\theta^2 \sigma_t^2}{\theta^2 + \sigma_t^2}.$


In the problem under consideration, $t$ is a linear function of the ordered
observations whose coefficients add up to zero. Hence $\sigma_t^2$ (for the given
sample size) is a multiple of $\sigma^2$, say, $k_n\sigma^2$. Furthermore, $\theta = \sigma$, and therefore,
the optimum $a$ is given by

$a = \frac{1}{1 + k_n}.$


The corresponding mean square deviation about $\sigma$ is given by


$E[(at - \sigma)^2] = \frac{\sigma^2}{1 + 1/k_n}.$

Consider two unbiased estimates, linear in the ordered observations,
which have variances $k_n\sigma^2$ and $k_n^*\sigma^2$ where $k_n < k_n^*$. The estimate with
variance $k_n\sigma^2$ yields the better biased estimate (of the above type) since


$\frac{1}{1 + 1/k_n} < \frac{1}{1 + 1/k_n^*}.$


Hence multiplying the $c_i$ contained in the solution of the problem in the
preceding section by an appropriate factor, which is equivalent to dividing
each $v_i$ by $a$, leads to the estimate which is optimum in the following
sense. Among all estimates which are linear in the ordered observations
and whose bias is independent of $\xi$, it has minimum mean square
deviation from $\sigma$.
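
The effect of the multiplier $a = 1/(1 + k_n)$ can be illustrated numerically. In the sketch below the function names are assumptions, and $k_n$ is taken from Table II: the entry .57084 is the variance, in units of $\sigma^2$, of the linear unbiased estimate for $n = 2$.

```python
def optimal_multiplier(k_n):
    """Multiplier a = 1 / (1 + k_n) applied to an unbiased estimate with variance k_n * sigma^2."""
    return 1.0 / (1.0 + k_n)

def mean_square_deviation(a, k_n):
    """E[(a t - sigma)^2] / sigma^2 = a^2 k_n + (a - 1)^2 when E(t) = sigma."""
    return a * a * k_n + (a - 1.0) ** 2

k2 = 0.57084                         # Table II, column [2], n = 2
a2 = optimal_multiplier(k2)
msd2 = mean_square_deviation(a2, k2)
```

The value obtained is about .3634, which agrees with the corresponding biased-estimate entry (.36340) in Table II.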


3. TABLES 


In Table I the $p_i$ values corresponding to the following are presented:


1. the best linear (in the ordered observations) unbiased estimate of $\sigma$,
2. the best linear (in the ordered observations) biased estimate of $\sigma$,
3. $p_i = \frac{i}{n + 1}$,
4. $p_i = \frac{i - 1/2}{n}$,


for values of $n = 1(1)10$.

TABLE I


COMPARISON OF THE ORDINATES (p) USED ON NORMAL
PROBABILITY PAPER FOR ESTIMATING THE
MEAN AND STANDARD DEVIATION


  n   Method    p_1        p_2       p_3       p_4       p_5

  7    [1]     .080456    .24824    ...       .50000
       [2]     .063732    .22986    ...       .50000
       [3]     .12500     .25000    .37500    .50000
       [4]     .071429    .21429    .35714    .50000

  8    [1]     .069624    ...       .33500    .44544
       [2]     .056027    ...       .32348    .44139
       [3]     .11111     .22222    .33333    .44444
       [4]     .062500    .18750    .31250    .43750

  9    [1]     .060607    ...       .30146    .40162    .50000
       [2]     .049425    ...       .28078    .39537    .50000
       [3]     .10000     .20000    .30000    .40000    .50000
       [4]     .055556    .16667    .27778    .38889    .50000

 10    [1]     .053568    ...       .27386    .36559    .45537
       [2]     .044192    ...       .26245    .35818    .45281
       [3]     .090909    .18182    .27273    .36364    .45455
       [4]     .050000    .15000    .25000    .35000    .45000


NOTE: When $i > n/2$, use $p_i = 1 - p_{n-i+1}$.

[1] These values of $p_i$ correspond to the ordinates that yield the minimum variance unbiased
estimate of $\sigma$ which is linear in the ordered observations.

[2] These values of $p_i$ correspond to the ordinates that yield the biased estimate of $\sigma$ which has
minimum mean square deviation from $\sigma$ and which is linear in the ordered observations.

[3] These values of $p_i$ correspond to $i/(n + 1)$.

[4] These values of $p_i$ correspond to $(i - \frac{1}{2})/n$.
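
The two conventional sets of ordinates, rows [3] and [4] of Table I, follow directly from their formulas. A short sketch (the function names are assumptions) that regenerates them for any $n$:

```python
def ordinates_i_over_n_plus_1(n):
    """Ordinates p_i = i / (n + 1), i = 1, ..., n (row [3] of Table I)."""
    return [i / (n + 1) for i in range(1, n + 1)]

def ordinates_midpoint(n):
    """Ordinates p_i = (i - 1/2) / n, i = 1, ..., n (row [4] of Table I)."""
    return [(i - 0.5) / n for i in range(1, n + 1)]
```

Both sets satisfy the symmetry rule of the note to Table I, $p_i = 1 - p_{n-i+1}$, so only the first half of each row needs to be tabulated.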

Table II presents the mean square deviations from σ of the estimates
which are linear in the ordered observations, the variance of the minimum
variance non-linear unbiased estimate of σ, and the mean square deviation
from σ of the non-linear biased estimate having minimum mean square
deviation. The minimum variance non-linear unbiased estimate of σ [1] is

\[ \frac{\Gamma\left(\frac{n-1}{2}\right)}{\sqrt{2}\,\Gamma\left(\frac{n}{2}\right)} \sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2} \]

and the corresponding biased estimate is

\[ \frac{\Gamma\left(\frac{n}{2}\right)}{\sqrt{2}\,\Gamma\left(\frac{n+1}{2}\right)} \sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}. \]




This latter estimate is optimum in the sense that among all estimates
having an expected value which is a multiple of σ and a variance
proportional to σ², it has minimum mean square deviation from σ. This estimate


TABLE II

COMPARISON OF THE MEAN SQUARE DEVIATIONS
FROM σ OF VARIOUS ESTIMATES OF σ

 n     [1]      [2]      [3]      [4]      [5]      [6]
 2   .57080   .57084   .36338   .36340  1.07533   .42011
 3   .27324   .27549   .21460   .21599   .49856   .22649
 4   .17810   .18006   .15117   .15259   .31559   .15558
 5   .13177   .13332   .11643   .11764   .22751   .11872
 6   .10447   .10571   .09459   .09560   .17630   .09605
 7   .08650   .08714   .07961   .08015   .14306   .08067
 8   .07379   .07469   .06872   .06950   .11987   .06954
 9   .06432   .06501   .06044   .06105   .10283   .06111
10   .05701   .05759   .05393   .05445   .08981   .05449


[1] Variance of the minimum variance non-linear unbiased estimate of σ.

[2] Variance of the minimum variance unbiased estimate which is linear in the ordered
observations.

[3] Mean square deviation from σ of the non-linear biased estimate which has minimum mean
square deviation.

[4] Mean square deviation from σ of the biased estimate which is linear in the ordered observations
and which has minimum mean square deviation.

[5] Mean square deviation from σ of the biased estimate based upon the ordinates i/(n+1).

[6] Mean square deviation from σ of the biased estimate based upon the ordinates (i - 1/2)/n.


is referred to above and subsequently as the non-linear biased estimate
having minimum mean square deviation from σ.
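
Columns [1] and [3] of Table II can be checked against the standard moments of the normal-theory sum of squares. The sketch below assumes σ = 1 and uses E[S²] = n - 1 and E[S] = √2 Γ(n/2)/Γ((n-1)/2), where S² is the sum of squared deviations from the sample mean; the function names are hypothetical.

```python
import math


def unbiased_variance(n):
    """Variance (sigma = 1) of c*S, with c chosen so that E[c*S] = 1."""
    ratio = math.gamma((n - 1) / 2) / math.gamma(n / 2)
    return (n - 1) * ratio * ratio / 2 - 1


def min_mse_biased(n):
    """Smallest mean square deviation (sigma = 1) of any multiple of S:
    1 - (E[S])^2 / E[S^2]."""
    ratio = math.gamma(n / 2) / math.gamma((n - 1) / 2)
    return 1 - 2 * ratio * ratio / (n - 1)
```

For n = 2 these give π/2 - 1 = .57080 and 1 - 2/π = .36338, the first entries of columns [1] and [3].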

In Table III these mean square deviations are transformed to efficiencies.
For the unbiased estimates the ratios of the variances are computed;
for the biased estimates the ratios of the mean square deviations from σ
to the minimum are computed.

It is evident from Tables II and III that the optimum choice of the p_i
depends upon whether an unbiased estimate is necessary or whether a
biased estimate can be tolerated. In either case, the graphical estimates
compare very favorably with the optimum estimates of the standard
deviation. For n ≤ 10 the efficiency of the optimum unbiased graphical
estimate relative to the optimum unbiased estimate is greater than 98
per cent, as is also the efficiency of the optimum biased graphical estimate
relative to the optimum biased estimate.




TABLE III-A
EFFICIENCY OF THE OPTIMUM UNBIASED ESTIMATE OF σ
(RATIO OF VARIANCES)

 n     [1]
 2   100.00
 3    99.19
 4    98.92
 5    98.84
 6    98.83
 7    98.86
 8    98.90
 9    98.94
10    98.99

TABLE III-B
COMPARISON OF EFFICIENCY OF VARIOUS BIASED ESTIMATES OF σ
(RATIO OF MEAN SQUARE DEVIATIONS FROM σ)

 n     [2]     [3]     [4]
 2    99.99   38.79   85.28
 3    99.36   43.04   94.75
 4    99.07   47.90   97.17
 5    98.97   51.18   98.07
 6    98.94   53.65   98.48
 7    99.33   55.65   98.68
 8    99.88   57.33   98.82
 9    99.00   57.78   98.90
10    99.04   60.05   98.97


[1] This entry is the ratio of the variance of the minimum variance non-linear unbiased estimate
to the variance of the minimum variance unbiased estimate which is linear in the ordered
observations, i.e., columns [1]/[2] in Table II.

[2] This entry is the ratio of the mean square deviation from σ of the non-linear biased estimate
having minimum mean square deviation to the mean square deviation from σ of the minimum
mean square deviation biased estimate which is linear in the ordered observations, i.e., columns
[3]/[4] in Table II.

[3] This entry is the ratio of the mean square deviation from σ of the non-linear biased estimate
having minimum mean square deviation to the mean square deviation from σ of the biased
estimate based upon the ordinates i/(n+1), i.e., columns [3]/[5] in Table II.

[4] This entry is the ratio of the mean square deviation from σ of the non-linear biased estimate
having minimum mean square deviation to the mean square deviation from σ of the biased
estimate based upon the ordinates (i - 1/2)/n, i.e., columns [3]/[6] in Table II.
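
As the notes indicate, each efficiency is a simple ratio of two Table II columns. A minimal sketch, using only three rows of Table II copied from above (the dictionary and function names are hypothetical):

```python
# Rows of Table II for n = 5, 6, 10: columns [1] .. [6].
TABLE_II = {
    5: (0.13177, 0.13332, 0.11643, 0.11764, 0.22751, 0.11872),
    6: (0.10447, 0.10571, 0.09459, 0.09560, 0.17630, 0.09605),
    10: (0.05701, 0.05759, 0.05393, 0.05445, 0.08981, 0.05449),
}


def efficiencies(n):
    """Per cent efficiencies: Table III-A [1], then Table III-B [2], [3], [4]."""
    c1, c2, c3, c4, c5, c6 = TABLE_II[n]
    ratios = (c1 / c2, c3 / c4, c3 / c5, c3 / c6)
    return tuple(round(100 * r, 2) for r in ratios)
```

For n = 5 this reproduces the tabulated 98.84, 98.97, 51.18 and 98.07. Small last-digit differences from the printed tables can arise elsewhere because Table II is itself rounded to five places.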


4. REFERENCES

[1] Godwin, H. J., "On the estimation of dispersion by linear systematic statistics,"
Biometrika, 36 (1949), 92-100.
[2] Godwin, H. J., "Some low moments of order statistics," Annals of Mathematical
Statistics, 20 (1949), 279-85.


ANALYSIS OF SIMPLE LATTICE DESIGNS WITH 
UNEQUAL SETS OF REPLICATIONS* 


PAUL MEIER
The Johns Hopkins University 


INTRODUCTION 


The lattice, or pseudo-factorial, designs first introduced by Yates
[11] and various generalizations of them [7, 8] have proved to be
quite useful, particularly in agricultural applications. These designs
are suited to experimental situations in which there are a large number
of varieties, treatments, or what have you, to be compared under
conditions which require a relatively small block size.

Among the wide variety of available incomplete block designs the 
lattices are distinguished by the fact that they combine a relatively 
simple type of analysis with a fair degree of flexibility in the choice of 
the number of varieties to be tested, the number of replications to be 
used, and so forth. 

In most of these designs the basic construction consists of p replication
patterns. The instructions for analysis, such as given in [2], allow
for any number of repetitions of any subset of this basic collection of
replication patterns. Thus if we use p' patterns from the basic collection
and repeat this subset r times we have a design with a total of
rp' replications. For example, if we wish to compare 25 varieties in a
5×5 lattice design we have available a basic set of 6 patterns. An
experiment with 6 replications may be designed using all six patterns
once, using each of three patterns twice, or using just two patterns
each repeated three times. (See Cochran and Cox [2, p. 281].) However,
if we wish to use 5 replications our choice would be limited to the
quintuple lattice, and the case of 7 replications is not covered at all. The
possibilities are even further restricted in the case of a 6×6 lattice,
for which the basic set includes only 3 patterns.

Even when the above prescription can be followed it may not be the
most desirable procedure. The more patterns from the basic set used,
the more tedious becomes the analysis, so that in situations where the
complexity and cost of calculations weigh heavily the designs using
more patterns, although generally more balanced, may lose in favor.
Conversely, it may be possible to achieve a more nearly balanced design
by dropping the requirement that each pattern be repeated the
same number of times.

* [...] No. 293 from the Department of Biostatistics, School of Hygiene and Public Health,
The Johns Hopkins University.

Finally, although it may not be a frequent occurrence, there are
unfortunate occasions on which for one reason or another essentially
all of one or more replications is lost, thus destroying the symmetry of
the original arrangement.

For these reasons it may be of interest to note the extent to which
the ordinary analysis for lattice designs must be modified for an
experiment in which the p' patterns from the basic set are repeated with
unequal frequencies.

In this paper we investigate the simplest case, the simple square
lattice, which uses only two replication patterns. Following the usual
convention, replications following one pattern are called X-replications
and those following the other pattern are called Y-replications. We will
refer to a design using n X-replications and m Y-replications as an
(n, m) design.

In addition to the question of relative ease of analysis, the efficiency
of unequal sets designs relative to alternatives is of interest and this is
considered in section I-G. The criterion used to measure efficiency is
the reciprocal of the average variance of all possible comparisons
between varieties, which is a kind of average "information," or invariance,
of comparisons. In experiments of identical construction having
different numbers of replications this quantity will be proportional to
the number of replications. Thus, to compare the efficiencies of two
designs which use different numbers of replications we use the average
invariance of comparisons divided by the number of replications. This
gives us an absolute scale on which to compare any two designs
involving the same number of varieties.

It is not claimed that this is an ideal criterion, but it is felt to be
satisfactory for the purpose at hand, namely, to compare an unequal sets
lattice design with an equal sets design. We will also compare the (2, 1)
design with the triple lattice and the (3, 2) design with the quintuple
lattice. Since a 5×5 lattice is generally considered to be the smallest
to which the recovery of inter-block information should be applied,
we give numerical calculations for it. The disadvantage of the unequal
sets lattice will be less in larger designs.

It will be seen that the analysis for the unequal sets simple lattice
designs differs to only a slight extent from the usual analysis given for
the equal sets designs. In view of this simplicity such designs may be
considered legitimate competitors with the more nearly balanced
alternatives, provided the relative efficiency is not too low. The maximum
losses of efficiency for the (2, 1) and (3, 2) designs relative to the
equal sets designs and the triple and quintuple lattices, respectively,
are given below for the 5×5 lattice.

MAXIMUM LOSS IN EFFICIENCY FOR A 5×5 SIMPLE
LATTICE WITH UNEQUAL SETS OF REPLICATIONS
RELATIVE TO OTHER DESIGNS


Alternative Design (2, 1) Design (3, 2) Design 


Simple Lattice with Equal Sets 6% 2% 
Triple Lattice 12% — 
Quintuple Lattice — 11% 


These maximum losses are realized when block variability is large or 
the intra-block analysis is used. 

The paper is divided into two parts. The first part deals with the 
theory of the analysis and the model behind it. The second part con- 
sists of a numerical example. 


I. DESIGN AND ANALYSIS 
A. Field Design and Mathematical Model 


The simple lattice designs permit the comparison of k² varieties in
blocks of size k. Thus a complete replication consists of k blocks, each
containing k varieties. The construction of replication patterns begins
by arranging the varieties in a square array. In the first, or X, pattern
those varieties appearing in any one row go into the same block. In the
second, or Y, pattern those varieties appearing in any one column go
into the same block. In agricultural work the blocks thus composed
would be assigned at random to the blocks laid out in a replication,
and the varieties within a block would be assigned at random to the
plots within a block. In other types of work the analogous method of
assignment would be followed. An excellent description of this
procedure is given by Cochran and Cox [2].
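
The construction just described can be sketched as follows. This is an illustration only: varieties are labelled 0, ..., k² - 1 in row-major order (an assumption made here for convenience), and the function names are hypothetical.

```python
import random


def simple_lattice_blocks(k):
    """Return (x_blocks, y_blocks) for a k x k simple lattice.
    X pattern: each row of the square array is a block.
    Y pattern: each column of the square array is a block."""
    square = [[i * k + j for j in range(k)] for i in range(k)]
    x_blocks = [list(row) for row in square]
    y_blocks = [[square[i][j] for i in range(k)] for j in range(k)]
    return x_blocks, y_blocks


def randomize(blocks, rng):
    """Field randomization for one replication: shuffle the varieties
    within each block, then shuffle the order of the blocks."""
    shuffled = [rng.sample(block, len(block)) for block in blocks]
    rng.shuffle(shuffled)
    return shuffled
```

A (3, 2) design would then use three independently randomized copies of the X pattern and two of the Y pattern, for example `randomize(x_blocks, random.Random(1))`.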

The mathematical model may be described as follows. Consider the
varieties in the square array from which the design is composed.
Denote by v_{ij} the "true" mean of the variety in the ith row and jth column
(less the over-all mean of the varieties). Denote by x_{ijr} the value
observed for the plot in the rth replication containing the variety v_{ij}. Then

\[ x_{ijr} = \mu + A_r + v_{ij} + \alpha_{ir} + \beta_{jr} + \epsilon_{ijr} \]

where

μ = Grand mean
A_r = Replication effect
v_{ij} = Variety effect
α_{ir} = ith block effect if the rth replication is in the X set; 0 if the rth replication is in the Y set
β_{jr} = jth block effect if the rth replication is in the Y set; 0 if the rth replication is in the X set
ε_{ijr} = Residual effect.


In the analysis that follows we will assume that the ε_{ijr} may be
considered to be normally and independently distributed about zero with
variance σ_e². For the intra-block analysis we need no other assumptions
since block effects are completely eliminated from the varietal
comparisons. A direct application of Cochran's theorem [4] to the analysis
of variance shows that the residual mean square is a proper estimate
of error, and by a little additional calculation the analysis can also be
made to provide an exact test of significance for the null hypothesis
that all v_{ij} are equal.

For the inter-block estimation of varietal effects we must make some
assumptions about the block effects, α_{ir} and β_{jr}. Besides the assumption
of randomness, insured by the method of assigning blocks within
replications, it is necessary to assume that block variability is the same
within each replication. To avoid expository clumsiness it is convenient
to assume, when dealing with the inter-block analysis, that the block
effects also are normally and independently distributed with variance
σ_β². (The calculation of average mean squares requires only that the
effects be uncorrelated, but this is not sufficient to justify the
application of the t-distribution to varietal comparisons. Of course, strict
normality need not be required. It has been shown that, for many
purposes, deviations from normality do not seriously affect the analysis
[5].) The quantities σ_A² and σ_v² which appear in the expressions for the
average mean squares represent the variation due to replication and
varietal effects. Assumptions about the nature of the distribution of
these effects or their random allocation have no bearing on either type
of analysis. Except in discussing the recovery of inter-block
information, block effects will be considered fixed rather than random effects.

As are most of the familiar experimental designs, the unequal sets
simple lattice is a partially balanced incomplete block (p.b.i.b.) design
(see discussion in [2]). The application to particular designs of the
general method for analyzing a p.b.i.b. design is nicely demonstrated in a
paper by Nair [10]. However, the inequality in number of X- and
Y-replications in our case gives us a design with three rather than the
usual two associate classes, and it would appear from this viewpoint
that the unequal sets case is essentially more complex than the equal
sets designs. However, the analysis follows directly by application of
Cochran's theorem without making any appeal to general results from
the theory of experimental designs. In this form the close similarity of
the analysis to that for equal sets is obvious.

The problem of choosing a suitable notation for exposition is by no
means simple. Our object is to provide a notation which suggests the
operations involved without being unduly cumbersome. For the quantities
which are sums or averages over all possible values of an index we
use the fairly common convention that a dot (·) replacing an index
means the average over all possible values of that index, whereas a
plus (+) replacing an index means the sum over all possible values of
that index. For example:

\[ \bar{x}_{ij\cdot} = \frac{1}{R} \sum_{r} x_{ijr}, \qquad x_{ij+} = \sum_{r} x_{ijr}. \]

The only difficulty with this notation is that a mean taken over
X-replications, for example, takes the clumsy form

\[ \frac{1}{R_1} \sum_{r=1}^{R_1} x_{ijr}, \]

where we have, say, R_1 X-replications and R_2 Y-replications. To avoid
writing such expressions we use the notation $\bar{x}^{X}_{ij\cdot}$ for the above, and
$\bar{x}^{Y}_{ij\cdot}$ for a mean over Y-replications. It follows that
$R\,\bar{x}_{ij\cdot} = R_1\,\bar{x}^{X}_{ij\cdot} + R_2\,\bar{x}^{Y}_{ij\cdot}$,
where R = R_1 + R_2 is the total number of replications.


B. Analysis of Variance 
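The analysis¹ follows from the identity

\[
\begin{aligned}
\sum_{i,j,r} (x_{ijr} - \bar{x}_{\cdot\cdot\cdot})^2
 &\qquad \text{(Total)} \\
={}& k^2 \sum_{r} (\bar{x}_{\cdot\cdot r} - \bar{x}_{\cdot\cdot\cdot})^2
 &\qquad \text{(Replications)} \\
&+ R \sum_{i,j} (\bar{x}_{ij\cdot} - \bar{x}_{\cdot\cdot\cdot})^2
 &\qquad \text{(Varieties ignoring blocks)} \\
&+ k \Big\{ \sum_{r=1}^{R_1} \sum_{i} \big[ (\bar{x}_{i\cdot r} - \bar{x}_{\cdot\cdot r}) - (\bar{x}^{X}_{i\cdot\cdot} - \bar{x}^{X}_{\cdot\cdot\cdot}) \big]^2
 + \sum_{r=R_1+1}^{R} \sum_{j} \big[ (\bar{x}_{\cdot jr} - \bar{x}_{\cdot\cdot r}) - (\bar{x}^{Y}_{\cdot j\cdot} - \bar{x}^{Y}_{\cdot\cdot\cdot}) \big]^2 \Big\}
 &\qquad \text{(Blocks A)} \\
&+ \frac{k R_1 R_2}{R} \Big\{ \sum_{i} \big[ (\bar{x}^{X}_{i\cdot\cdot} - \bar{x}^{X}_{\cdot\cdot\cdot}) - (\bar{x}^{Y}_{i\cdot\cdot} - \bar{x}^{Y}_{\cdot\cdot\cdot}) \big]^2
 + \sum_{j} \big[ (\bar{x}^{Y}_{\cdot j\cdot} - \bar{x}^{Y}_{\cdot\cdot\cdot}) - (\bar{x}^{X}_{\cdot j\cdot} - \bar{x}^{X}_{\cdot\cdot\cdot}) \big]^2 \Big\}
 &\qquad \text{(Blocks B, eliminating varieties)} \\
&+ \Big\{ \sum_{r=1}^{R_1} \sum_{i,j} \big( x_{ijr} - \bar{x}_{i\cdot r} - \bar{x}_{ij\cdot} + \bar{x}_{i\cdot\cdot} + \bar{x}_{\cdot j\cdot} - \bar{x}^{X}_{\cdot j\cdot} + \bar{x}^{X}_{\cdot\cdot\cdot} - \bar{x}_{\cdot\cdot\cdot} \big)^2 \\
&\qquad + \sum_{r=R_1+1}^{R} \sum_{i,j} \big( x_{ijr} - \bar{x}_{\cdot jr} - \bar{x}_{ij\cdot} + \bar{x}_{\cdot j\cdot} + \bar{x}_{i\cdot\cdot} - \bar{x}^{Y}_{i\cdot\cdot} + \bar{x}^{Y}_{\cdot\cdot\cdot} - \bar{x}_{\cdot\cdot\cdot} \big)^2 \Big\}
 &\qquad \text{(Residual)}
\end{aligned}
\]

Here a superscript X or Y marks a mean taken over the X-replications
(r = 1, ..., R_1) or the Y-replications (r = R_1 + 1, ..., R) only.

1 Expressions suitable for computation are given with the numerical example in part II.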
It can be verified directly that the sums of cross products such as

\[ \sum_{i,j,r} (\bar{x}_{\cdot\cdot r} - \bar{x}_{\cdot\cdot\cdot})(\bar{x}_{ij\cdot} - \bar{x}_{\cdot\cdot\cdot}) \]

are all zero. From this it follows that the rank of the quadratic form is
equal to the sum of the ranks of the various components. It follows
from Cochran's theorem [4] that the component sums of squares are
distributed independently. With the exception of the residual terms
the ranks of the quadratic forms can be seen directly, and the rank of
the residual may be obtained by subtraction.
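
The partition of the total sum of squares can be checked numerically. The sketch below is an illustration under stated assumptions, not the computing scheme of part II: the data are held in a hypothetical array `x[i, j, r]` (row i, column j of the variety square, replication r), the first `r1` replications are taken to be X-replications (rows are blocks), the rest Y-replications (columns are blocks), and the residual is computed directly from its definition rather than by subtraction.

```python
import numpy as np


def lattice_sums_of_squares(x, r1):
    """Component sums of squares for an (r1, r - r1) simple lattice."""
    k, _, r = x.shape
    r2 = r - r1
    xs, ys = x[:, :, :r1], x[:, :, r1:]            # X- and Y-replications
    g, gx, gy = x.mean(), xs.mean(), ys.mean()     # grand means
    var = x.mean(axis=2)                           # variety means
    row, col = x.mean(axis=(1, 2)), x.mean(axis=(0, 2))
    row_x, col_x = xs.mean(axis=(1, 2)), xs.mean(axis=(0, 2))
    row_y, col_y = ys.mean(axis=(1, 2)), ys.mean(axis=(0, 2))
    rep = x.mean(axis=(0, 1))                      # replication means
    blk_x = xs.mean(axis=1) - xs.mean(axis=(0, 1))  # [i, r]: block minus rep mean
    blk_y = ys.mean(axis=0) - ys.mean(axis=(0, 1))  # [j, r]
    u = (row_x - gx) - (row_y - gy)
    w = (col_y - gy) - (col_x - gx)
    res_x = (xs - xs.mean(axis=1, keepdims=True) - var[:, :, None]
             + row[:, None, None] + col[None, :, None]
             - col_x[None, :, None] + gx - g)
    res_y = (ys - ys.mean(axis=0, keepdims=True) - var[:, :, None]
             + row[:, None, None] + col[None, :, None]
             - row_y[:, None, None] + gy - g)
    return {
        "total": ((x - g) ** 2).sum(),
        "replications": k * k * ((rep - g) ** 2).sum(),
        "varieties ignoring blocks": r * ((var - g) ** 2).sum(),
        "blocks A": k * (((blk_x - (row_x - gx)[:, None]) ** 2).sum()
                         + ((blk_y - (col_y - gy)[:, None]) ** 2).sum()),
        "blocks B": k * r1 * r2 / r * ((u ** 2).sum() + (w ** 2).sum()),
        "residual": (res_x ** 2).sum() + (res_y ** 2).sum(),
    }
```

For any data array with at least one replication of each kind, the five components should add up to the total, which is the content of the identity.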
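Also, excepting the residual, the average mean squares are easy to
determine. For example,

\[ \operatorname{Ave}\Big\{ \sum_{r} (\bar{x}_{\cdot\cdot r} - \bar{x}_{\cdot\cdot\cdot})^2 \Big\}
 = \operatorname{Ave}\Big\{ \sum_{r} \big[ (A_r - \bar{A}_{\cdot}) + (\bar{\alpha}_{\cdot r} + \bar{\beta}_{\cdot r} - \bar{\alpha}_{\cdot\cdot} - \bar{\beta}_{\cdot\cdot}) + (\bar{\epsilon}_{\cdot\cdot r} - \bar{\epsilon}_{\cdot\cdot\cdot}) \big]^2 \Big\}. \]

As the various effects are assumed to be independent, the average
values of the cross terms are zero and the expression becomes

\[ \operatorname{Ave}\Big\{ \sum_{r} (A_r - \bar{A}_{\cdot})^2 + \sum_{r} (\bar{\alpha}_{\cdot r} + \bar{\beta}_{\cdot r} - \bar{\alpha}_{\cdot\cdot} - \bar{\beta}_{\cdot\cdot})^2 + \sum_{r} (\bar{\epsilon}_{\cdot\cdot r} - \bar{\epsilon}_{\cdot\cdot\cdot})^2 \Big\}
 = (R - 1)\sigma_A^2 + \frac{R - 1}{k}\,\sigma_\beta^2 + \frac{R - 1}{k^2}\,\sigma_e^2, \]

and the average mean square for replications is k²/(R - 1) times this
expression. In the same manner we find the other average mean
squares, as usual obtaining the residual by subtraction. The result of
these calculations may be tabulated as follows.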


Source of variation                    Average mean square                 Degrees of freedom
Replications                           k²σ_A² + kσ_β² + σ_e²               R - 1
Varieties ignoring blocks              Rσ_v² + [k/(k+1)]σ_β² + σ_e²        k² - 1
Blocks A                               kσ_β² + σ_e²                        (R - 2)(k - 1)
Blocks B (eliminating varieties)       ½kσ_β² + σ_e²                       2(k - 1)
Residual                               σ_e²                                (k - 1)(Rk - k - 1)
Total                                                                      Rk² - 1


C. Tests of Significance 


The analysis of variance table just presented does not suggest an
exact test of significance for differences between varietal means. It may
be, especially in an experiment with a large number of varieties, that
the experimenter well knows that differences do exist and the
significance test is really beside the point. Cochran [3] has described rather
completely both approximate and exact tests of significance in the
case of equal sets of replications. The unequal sets case may be treated
in the same way.

An approximate test may be made by viewing the experiment as a
randomized complete block design. That is, we regard the variation due
to blocks as part of the experimental error. This is given effect by
pooling the blocks and residual sums of squares and using the pooled mean
square to test the significance of the varieties ignoring blocks mean
square.

If we regard the block effects as randomly drawn from a normal
distribution with mean zero and variance σ_β², the distribution of the
test ratio under the null hypothesis is that of the ratio of two mixtures
of chi-squares, each with the same average value. Thus the null
distribution is not quite the same as the F distribution. However, the
error arising from this source appears to be quite small.

An exact test can be found from an alternative analysis of variance.
The foregoing analysis can be further subdivided as follows. The sum
of squares for varieties ignoring blocks may be written

\[ R \sum_{i,j} (\bar{x}_{ij\cdot} - \bar{x}_{\cdot\cdot\cdot})^2
 = kR \Big\{ \sum_{i} (\bar{x}_{i\cdot\cdot} - \bar{x}_{\cdot\cdot\cdot})^2 + \sum_{j} (\bar{x}_{\cdot j\cdot} - \bar{x}_{\cdot\cdot\cdot})^2 \Big\}
 + R \sum_{i,j} (\bar{x}_{ij\cdot} - \bar{x}_{i\cdot\cdot} - \bar{x}_{\cdot j\cdot} + \bar{x}_{\cdot\cdot\cdot})^2 \]

or

SS (varieties ignoring blocks)
= SS (varietal main effects ignoring blocks) + SS (varietal interactions).

(The terminology is borrowed from the pseudo-factorial description
of the analysis [11].)

The average mean squares and degrees of freedom are as follows. 


Source of variation                        Average mean square                 Degrees of freedom
Varieties ignoring blocks                  Rσ_v² + [k/(k+1)]σ_β² + σ_e²        k² - 1
Varietal main effects ignoring blocks      Rσ_v² + ½kσ_β² + σ_e²               2(k - 1)
Varietal interactions                      Rσ_v² + σ_e²                        (k - 1)²


Now the SS (varietal main effects ignoring blocks) can be replaced by
an expression for SS (varietal main effects eliminating blocks):

\[ \text{SS (varietal main effects eliminating blocks)}
 = kR_2 \sum_{i} (\bar{x}^{Y}_{i\cdot\cdot} - \bar{x}^{Y}_{\cdot\cdot\cdot})^2 + kR_1 \sum_{j} (\bar{x}^{X}_{\cdot j\cdot} - \bar{x}^{X}_{\cdot\cdot\cdot})^2, \]

where a superscript X or Y denotes a mean taken over the X- or
Y-replications only. The average mean square is ½Rσ_v² + σ_e², again on
2(k - 1) degrees of freedom. Unfortunately this sum of squares is not
independent² of blocks B. To restore independence we must replace
blocks B by blocks B_u, the sum of squares for blocks B ignoring varieties:

\[ \text{SS (blocks } B_u)
 = kR_1 \sum_{i} (\bar{x}^{X}_{i\cdot\cdot} - \bar{x}^{X}_{\cdot\cdot\cdot})^2 + kR_2 \sum_{j} (\bar{x}^{Y}_{\cdot j\cdot} - \bar{x}^{Y}_{\cdot\cdot\cdot})^2. \]

The average mean square is ½Rσ_v² + kσ_β² + σ_e² on 2(k - 1) degrees of
freedom. An exact test of significance for varietal effects is now provided
by the pooled mean square for varieties eliminating blocks (on k² - 1
degrees of freedom) against the residual. We give below the average mean
squares corresponding to this analysis of variance table.

2 For this reason, the present writer prefers to avoid presenting a single table including both
varieties eliminating blocks and blocks eliminating varieties as in [10]. The degrees of freedom in such
a table add up correctly, but the sums of squares do not.


Source of variation                              Average mean square           Degrees of freedom
Replications                                     k²σ_A² + kσ_β² + σ_e²         R - 1
Varietal main effects (eliminating blocks)       ½Rσ_v² + σ_e²                 2(k - 1)
Varietal interactions                            Rσ_v² + σ_e²                  (k - 1)²
Blocks A                                         kσ_β² + σ_e²                  (R - 2)(k - 1)
Blocks B_u                                       ½Rσ_v² + kσ_β² + σ_e²         2(k - 1)
Residual                                         σ_e²                          (k - 1)(Rk - k - 1)


D. Estimation of Varietal Means 


In the following we will use the notation v_{ij}' for the intra-block
estimate of the true varietal mean, v_{ij}, and v_{ij}'' for the inter-block
estimate. v_{ij}' is obtained by averaging the appropriate values, one from
each replication, and adding two corrections to eliminate block effects,
one for the X-replications and one for the Y-replications:

\[ v_{ij}' = \bar{x}_{ij\cdot} + C_{Xi} + C_{Yj}, \]

where

\[ C_{Xi} = \frac{R_1}{R} \big[ (\bar{x}^{Y}_{i\cdot\cdot} - \bar{x}^{Y}_{\cdot\cdot\cdot}) - (\bar{x}^{X}_{i\cdot\cdot} - \bar{x}^{X}_{\cdot\cdot\cdot}) \big] \]

and

\[ C_{Yj} = \frac{R_2}{R} \big[ (\bar{x}^{X}_{\cdot j\cdot} - \bar{x}^{X}_{\cdot\cdot\cdot}) - (\bar{x}^{Y}_{\cdot j\cdot} - \bar{x}^{Y}_{\cdot\cdot\cdot}) \big], \]

a superscript X or Y denoting a mean taken over the X- or Y-replications only.

An easy calculation shows that block effects are completely eliminated
from this estimate. However, the elimination of block error necessarily
introduces additional residual error. If a reasonable estimate of the
ratio of block variance to residual variance is available, a partial
correction will give an estimate with smaller variance. Thus, we consider
an estimate of the form

\[ v_{ij}'' = \bar{x}_{ij\cdot} + \lambda C_{Xi} + \nu C_{Yj}, \]

where λ and ν are chosen to minimize the variance of v_{ij}''. These
minimizing values and the method of estimating them are discussed in the
next two sections.
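
The claim that block effects drop out of the fully corrected estimate can be illustrated numerically. The sketch below rests on assumptions made here rather than taken from the paper: the data sit in a hypothetical array `x[i, j, r]` with the first `r1` replications being X-replications (rows are blocks), the row correction is taken as (R_1/R) times the difference between the centred row means of the Y- and X-replications, and the column correction is formed symmetrically.

```python
import numpy as np


def varietal_estimates(x, r1, lam=1.0, nu=1.0):
    """Adjusted variety means; lam = nu = 1 gives the intra-block estimate,
    lam = nu = 0 the unadjusted means."""
    k, _, r = x.shape
    r2 = r - r1
    xs, ys = x[:, :, :r1], x[:, :, r1:]
    row_x = xs.mean(axis=(1, 2)) - xs.mean()   # centred row means, X-replications
    row_y = ys.mean(axis=(1, 2)) - ys.mean()   # centred row means, Y-replications
    col_x = xs.mean(axis=(0, 2)) - xs.mean()
    col_y = ys.mean(axis=(0, 2)) - ys.mean()
    c_x = (r1 / r) * (row_y - row_x)           # removes X-block (row) effects
    c_y = (r2 / r) * (col_x - col_y)           # removes Y-block (column) effects
    return x.mean(axis=2) + lam * c_x[:, None] + nu * c_y[None, :]
```

Adding arbitrary block effects to the data shifts every intra-block estimate by the same constant, so all comparisons between varieties are unaffected; with `lam` or `nu` below one, part of the block effect remains in the comparisons.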


E. Variances of Varietal Differences 


The varietal differences will be of three kinds, each having a different
variance:

1) both varieties in the same X block
2) both varieties in the same Y block
3) varieties not in the same X or Y block.

The variances of varietal differences for arbitrarily fixed values of
λ and ν can be calculated directly. For example, for varieties in the
same X block we have

\[ V_1 = \operatorname{Var}\{ v_{ij}'' - v_{ij'}'' \}
 = \frac{2\sigma_e^2}{Rk} \Big( k + \frac{R_2}{R_1}\,\nu^2 \Big) + \frac{2R_2\,\sigma_\beta^2}{R^2}\,(1 - \nu)^2. \]

It is easily seen that these variances are all minimized by the same
values of λ and ν, namely

\[ \lambda^* = \frac{R_2 k \sigma_\beta^2}{R_2 k \sigma_\beta^2 + R \sigma_e^2}, \qquad
 \nu^* = \frac{R_1 k \sigma_\beta^2}{R_1 k \sigma_\beta^2 + R \sigma_e^2}, \]

and for these values of λ and ν the three variances reduce to

\[ 1)\quad V_1 = \frac{2\sigma_e^2}{Rk} \Big( k + \frac{R_2}{R_1}\,\nu^* \Big) \]

\[ 2)\quad V_2 = \frac{2\sigma_e^2}{Rk} \Big( k + \frac{R_1}{R_2}\,\lambda^* \Big) \]

\[ 3)\quad V_3 = \frac{2\sigma_e^2}{Rk} \Big( k + \frac{R_1}{R_2}\,\lambda^* + \frac{R_2}{R_1}\,\nu^* \Big). \]

Since λ* and ν* lie between zero and one, these variances will not differ
by much if k is large and R_1 and R_2 are nearly equal. In this case, the
average variance, weighted with the number of comparisons of each
type, may suffice. This is found to be

\[ 4)\quad \bar{V} = \frac{2\sigma_e^2}{R(k+1)} \Big( k + 1 + \frac{R_1}{R_2}\,\lambda^* + \frac{R_2}{R_1}\,\nu^* \Big). \]

In the above calculations the quantities λ* and ν* are treated as
constants, whereas in fact they must be estimated from the
experimental data. When k and R are both small, the error involved may be
large. A preliminary investigation which will be reported later indicates
that when k > 6, the error will definitely be negligible.


F. Estimation of λ* and ν*


If the ratio of σ_β² to σ_e² were known, λ* and ν* could be determined
exactly. In the absence of such knowledge, the ratio can be estimated
from the residual and blocks eliminating varieties sums of squares. The
optimum method of using this information has not been determined,
but the method originally given by Yates [12] seems to be satisfactory.
The blocks A and blocks B sums of squares are pooled and equated
to the resulting average value. Proceeding similarly with the residual
we now have two equations in the two unknowns, σ_β² and σ_e². The
solutions of these two equations are substituted into the formulas for λ*
and ν*. If it should happen that the pooled block mean square is
actually smaller than the residual mean square, we take zero as our estimate
of λ* and ν*. The resulting estimates of λ* and ν* are

\[ \hat{\lambda}^* = \frac{b - e}{b + \dfrac{R_1 - 1}{R_2}\,e} \ \text{ if } b \ge e, \qquad \hat{\lambda}^* = 0 \ \text{ if } b < e, \]

\[ \hat{\nu}^* = \frac{b - e}{b + \dfrac{R_2 - 1}{R_1}\,e} \ \text{ if } b \ge e, \qquad \hat{\nu}^* = 0 \ \text{ if } b < e, \]

where b = pooled mean square for blocks eliminating varieties, and
e = residual mean square. For the inter-block analysis the variances of
varietal differences are estimated as follows.

If b ≥ e, we substitute these estimates in place of λ* and ν*
in the formulas of section E, and use the residual mean square as an
estimate of σ_e². If b < e, we replace λ* and ν* by zero and use the pooled
mean square for blocks and residual to estimate σ_e².

For the intra-block analysis there is no need to estimate λ* and ν*.
We take λ = ν = 1 in the formulas of section E and use the residual mean
square to estimate σ_e².
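
The estimation rule can be sketched as below. The code assumes weights of the form λ* = R_2 kσ_β² / (R_2 kσ_β² + Rσ_e²) and ν* = R_1 kσ_β² / (R_1 kσ_β² + Rσ_e²), and a pooled block mean square b (on R(k - 1) degrees of freedom) whose average value is [(R - 1)/R] kσ_β² + σ_e²; both are reconstructions made for this sketch, and the function name is hypothetical.

```python
def estimate_weights(b, e, r1, r2):
    """Estimated (lambda*, nu*) from the pooled block mean square b
    (blocks eliminating varieties) and the residual mean square e."""
    if b < e:
        # Pooled block mean square below the residual: no correction is applied.
        return 0.0, 0.0
    lam = (b - e) / (b + (r1 - 1) / r2 * e)
    nu = (b - e) / (b + (r2 - 1) / r1 * e)
    return lam, nu
```

If b and e are set equal to their average values, the function returns the population weights exactly. For instance, with (R_1, R_2) = (3, 2), kσ_β² = 4 and σ_e² = 1 we have b = 4.2, e = 1 and weights 8/13 and 12/17.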

The intra-block estimates of the variances of varietal comparisons
have the usual distribution of a mean square with the residual degrees
of freedom. The distribution of the inter-block estimates is complex,
but it will be reasonably well approximated by a mean square
distribution with the same degrees of freedom.


G. Efficiency of the Unequal Sets Designs


The usual measure of efficiency for an experiment designed to
compare several similar quantities is the reciprocal of the variance of the
estimated difference between any two of them. When, as is the case in
lattice designs, different comparisons do not all have the same variance,
the usual practice is to use the reciprocal of the average variance of
all possible comparisons. Since the variance of a comparison in a given
experiment is inversely proportional to the number of replications used,
the above quantity divided by the number of replications is a measure of
the intrinsic efficiency of the design. We need such an intrinsic criterion
particularly in order to gauge the efficiency of the (2, 1) and (3, 2)
designs relative to a lattice with equal sets of replications.

The efficiency of a given design relative to an alternative will be
measured by the ratio of the above criteria for the two designs in
experiments for which σ_β² and σ_e² are the same for both designs. The
calculation of variances is straightforward and results in the following.

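AVERAGE VARIANCE OF VARIETAL COMPARISONS FOR
VARIOUS DESIGNS

Design                     Average variance of comparisons

Randomized Blocks          \frac{2\sigma_e^2}{R(k+1)} \left[ k + \frac{k\sigma_\beta^2}{\sigma_e^2} + 1 \right]

Lattice with Equal Sets    \frac{2\sigma_e^2}{R(k+1)} \left[ k + \frac{k\sigma_\beta^2}{\frac{1}{2}k\sigma_\beta^2 + \sigma_e^2} + 1 \right]

Triple Lattice             \frac{2\sigma_e^2}{R(k+1)} \left[ k + \frac{k\sigma_\beta^2}{\frac{2}{3}k\sigma_\beta^2 + \sigma_e^2} + 1 \right]

Quintuple Lattice          \frac{2\sigma_e^2}{R(k+1)} \left[ k + \frac{k\sigma_\beta^2}{\frac{4}{5}k\sigma_\beta^2 + \sigma_e^2} + 1 \right]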


The triple and quintuple lattices are included for comparison with
the (2, 1) and (3, 2) designs. Comparing the average variance of
comparisons in the unequal sets lattice with the variances listed we find the
following relative efficiencies.

EFFICIENCY OF UNEQUAL SETS LATTICE RELATIVE TO
ALTERNATIVE DESIGNS

Alternative design         Relative efficiency* of (R_1, R_2) simple lattice

Randomized Blocks          \frac{k + \dfrac{k\sigma_\beta^2}{\sigma_e^2} + 1}{k + \dfrac{R_1}{R_2}\lambda^* + \dfrac{R_2}{R_1}\nu^* + 1}

Lattice with Equal Sets    \frac{k + \dfrac{k\sigma_\beta^2}{\frac{1}{2}k\sigma_\beta^2 + \sigma_e^2} + 1}{k + \dfrac{R_1}{R_2}\lambda^* + \dfrac{R_2}{R_1}\nu^* + 1}

Triple Lattice             \frac{k + \dfrac{k\sigma_\beta^2}{\frac{2}{3}k\sigma_\beta^2 + \sigma_e^2} + 1}{k + \dfrac{R_1}{R_2}\lambda^* + \dfrac{R_2}{R_1}\nu^* + 1}

Quintuple Lattice          \frac{k + \dfrac{k\sigma_\beta^2}{\frac{4}{5}k\sigma_\beta^2 + \sigma_e^2} + 1}{k + \dfrac{R_1}{R_2}\lambda^* + \dfrac{R_2}{R_1}\nu^* + 1}

* These relative efficiencies are calculated without allowance for the inaccuracy of weighting. This
discriminates against the randomized block design which is, in fact, somewhat more efficient than the
alternatives (rather than slightly less) when σ_β² is very small.


The relative efficiencies using the intra-block analysis may be
obtained by replacing λ* and ν* by one and, in the case of the alternative
lattices, taking the limit as σ_β²/σ_e² becomes large. For the randomized
block design there is no intra-block analysis and the efficiency remains
a function of σ_β²/σ_e².

In the comparisons with other lattice designs it will be noted that
the relative efficiencies approach one when k is large, or when σ_β²
approaches zero and the inter-block analysis is used. It can also be
verified by straightforward algebra that the least favorable values for
the unequal sets designs occur when σ_β² is large, or when the
intra-block analysis is used. Since k = 5 is frequently taken as the lower limit
for reasonable accuracy in the recovery of inter-block information, cf.
[2], we will examine this case in detail.

We note first that our unequal sets design is the least efficient of the
lattices compared. This is only to be expected since it is the furthest
from balance. However, compared with the equal sets design it does
not do at all badly. In the worst case, that of large σ_β²/σ_e², or the
intra-block analysis, the efficiency relative to the equal sets design becomes

\[ \frac{k + 3}{k + \dfrac{R_1}{R_2} + \dfrac{R_2}{R_1} + 1}. \]

Thus, for a 5×5 lattice the (2, 1) design is at worst 94 per cent efficient
and the (3, 2) design 98 per cent efficient. These efficiencies will
improve with larger k, as noted earlier. Hence, to a fair approximation,
the efficiency of unequal sets designs relative to alternatives is about
the same as that of the equal sets designs, provided the inequality in
the number of X- and Y-replications is not excessive.

The advantage of the unequal sets designs lies in the simplicity of
the calculations required, these being no more arduous than for the
lattice with equal sets. In the triple and quintuple lattices, on the
other hand, the additional labor involved in finding the adjusted means
may be considerable. In circumstances where the cost of analysis is an
appreciable portion of the cost of experimentation, the unequal sets
designs may be regarded as competitors of the more balanced, but
more complex, triple and quintuple lattice designs. The least favorable
situation for the unequal sets designs relative to the triple and
quintuple lattices is again found to be the intra-block analysis with k small.
We see directly that using the intra-block analysis with k = 5, the (2, 1)
design is 88 per cent efficient relative to the triple lattice and the
(3, 2) design is 89 per cent efficient relative to the quintuple lattice.
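
The percentages quoted for the 5×5 lattice can be reproduced with a short calculation. The sketch assumes average variances proportional to k + 1 + (R_1/R_2)λ* + (R_2/R_1)ν* for the unequal sets design and to k + 1 + pg/((p - 1)g + p) for an equal-frequency lattice built on p patterns (p = 2, 3, 5 for the simple, triple and quintuple lattices), where g = kσ_β²/σ_e². These forms are reconstructions chosen to agree with the figures in the text, and the function names are hypothetical.

```python
def unequal_sets_term(g, r1, r2):
    """(R1/R2)*lambda* + (R2/R1)*nu* with g = k*sigma_beta^2 / sigma_e^2."""
    r = r1 + r2
    lam = r2 * g / (r2 * g + r)
    nu = r1 * g / (r1 * g + r)
    return (r1 / r2) * lam + (r2 / r1) * nu


def lattice_term(g, p):
    """Corresponding term for a lattice using p patterns equally often."""
    return p * g / ((p - 1) * g + p)


def relative_efficiency(k, r1, r2, p, g):
    """Efficiency of the (r1, r2) simple lattice relative to the p-pattern lattice."""
    return (k + 1 + lattice_term(g, p)) / (k + 1 + unequal_sets_term(g, r1, r2))


# Intra-block analysis corresponds to very large block variability.
LARGE = 1e12
worst = {
    "(2,1) vs equal sets": relative_efficiency(5, 2, 1, 2, LARGE),
    "(3,2) vs equal sets": relative_efficiency(5, 3, 2, 2, LARGE),
    "(2,1) vs triple": relative_efficiency(5, 2, 1, 3, LARGE),
    "(3,2) vs quintuple": relative_efficiency(5, 3, 2, 5, LARGE),
}
```

Under these assumptions the four worst-case values round to 94, 98, 88 and 89 per cent, and every relative efficiency tends to one as g tends to zero.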

If it is known in advance that the block variability is not excessive,
we may be assured of somewhat greater efficiency. The usual measure
of block variability is the ratio of the variance between blocks to that
within blocks, which is

\[ \gamma = \frac{k\sigma_\beta^2 + \sigma_e^2}{\sigma_e^2} \]

(or w/w' in the notation of [2] and [12]). We see that the efficiencies
can be expressed as functions of γ. In the above two cases the maximum
loss in efficiency will be reduced by at least half if γ is no greater than
five.


II. NUMERICAL EXAMPLE³
The material used in the example is drawn from two separate experiments
at different locations. Both experiments used the same varieties
and design, each with two X- and two Y-replications. To construct
an example of a (3, 2) lattice, an X-replication from the second experiment
was added to the first experiment, giving three X- and two Y-replications.
Any interpretation of our numerical results must bear in
mind the artificial nature of the “experiment” analyzed. The crop in
question was corn and the variable was field weight of grain in pounds.
Each plot consisted of 30 (= 2 × 15) single plant hills.

³ The writer is indebted for the experimental data to Dr. P. H. Harvey, Agronomist, Bureau of
Plant Industry, U. S. Dept. of Agriculture, North Carolina State College, and [...] of Dr. R. J. Monroe.

Table I presents the basic data with the totals for each block. In 
Table II we have the X-replications combined and the Y-replications 
combined, with row and column totals in each case. Table ПІ gives the 
grand total for each variety with the row and column totals. 


TABLE I 
SINGLE PLOT YIELDS 


X X-REPLICATIONS 88 

14.2 16.0 14.8 16.7 16.8 16.3 94.8 

14.6 15.8 16.0 14.9 12.5 15.2 89.0 

17.8 19.8 19.4 10.3 13.5 15.6, 95.9 

16.0 13.6 . 15.3 11.0 14.4 15.4 86.3 

Тала 001218 5115.000 19.8. 15.8 13.1 82.0 

б) 14:0; 15.8.5.17.0 017.0 | 18.5 95.5 

544.4 83711,8 

Xi 

18:210/15:00.:018.9 Б 12.2. ^ 15.1-, 84.5 

dn 8:32 .018:1.00030)43 19.00. 16.9 92.2 

18:8 19.0 15.9, 13.5 18.0 14.4 94.6 

Tortie dae EO UI25 —14.1 . 14-0 83.9 

14.0 13.0 13.0 15.8 ^ 14.5 11.1 81.4 

16.2 148 153 18.0 .16.6 12.5 93.4 

-530.0 79332.8 

X; 

14.0 18.3 19.3 143 18.3 14.9 99.3 

ОТОО ОО аа В 12,8 - 14,2 ^ 98.0 

J1.8 20.1 15.4 16.7 18.8 171 99.4 

16.2 15.8 15.6 9.8 18.0 160 91.4 

12.2.5 10.8 16:38. 12.7 15.7. 15.3. 802.5 

16:1 (246 . 18.7 19.7. 16.8 12.9 98.2 


563.8  . 90884.2 




Y-REPLICATIONS j 
Y: 
17.3 15.0 19.5 19.6 17.7 20.0 109.1 
18.7 : 18.8 20.2 19.3 15.8 17.0 109.8 
16.4 23.2 19.1 16.5 16.9 16.7 108.8 
18.0 22.8 15.5 , 13.2 18.3 19.0 106.8 
15.1 14.0 14.0 16.2 12.6 17.6 89.5 
18.1 16.0 13.2 16.5 14.0 15.1, 92.9 


616.9 107923.5 


Үз 
7 15.3 13.1 16.9 91.4 
Š 4 16.8 12.5 18.2 97.0 
17.5 22.5 15.4 19.7 18.6 17.8 111.5 
8 
1 


18.5. 18.9 12.8. 14,85) 46:8 оо 

18.5 15.1 18. 17.1 16.6 16.4 101.8 

19.6 18.2 14.0 17.5 15.1 14.1 98.5 
597.2 100730.4 

TABLE II
COMBINATION OF REPLICATIONS 

X=XitMi+X ) i Totals ЕБС 
as 49.9 aso Ева TRUST 85.7 
46.0 51.0 53.1 38.9 38.8 46.3 274.2 100.8 
42.4 58.9 50.7 40.5 503 47.1 2899 10.9 
47.8 43.1 44.9 93.9 4605 45.4 261.6 84.3 
30.4 35.6 44.5 42.3 45.5 $9.5, 246.8 68.9 
49.0 43.9 49.8 54.7 60.8 38.07 287.1 45.3 


260.8 282.4 291.0 255.8 27877 263.5 1638.2 ° 365.9 


YsY;+Y; d RRikCy; 
31.9 зов 35.2 34.9. 30-8 36.9 200.5 = 67.9 
32.7 34.9 39.6 36.1 28,3 35.2 206.8 —55.6 
33.9 45.7 34.5 36.2 35.5 34.5 220.3 —78.9 
34.5 41.7 5 28.3. 28.0 34.6 36.7 203.8 —99.8 
33.0 291 321 33.3 29.2 340 191.3 —16.5 
37.7 342 27.2 34.0, 29.1 29.2 191.4 —47.2 


Lo 
204.3 216.4 196.9 202.5 187.5 206.5 1214.1 —365.9 




A. The Analysis of Variance

The analysis proceeds in the usual way as follows. The correction
term is given by

$$C = \frac{(\text{grand total})^2}{Rk^2} = \frac{(2852.3)^2}{180} = 45197.86.$$

The total sum of squares is obtained by summing the squares of each
plot yield and subtracting the correction term.

$$\text{Total SS} = \sum (\text{plot yield})^2 - C = (14.2)^2 + (16.0)^2 + \cdots + (14.1)^2 - C = 46258.27 - 45197.86 = 1060.41.$$


TABLE III
VARIETY TOTAL YIELDS

                                                      Total
         73.5   82.6   81.9   80.0   80.9   84.0      482.9
         77.4   85.9   98.8   80.6   67.4   80.5      490.6
         77.6   98.5   85.2   68.8   82.4   74.3      486.8
         82.7   79.2   81.1   61.9   79.8   79.4      464.1
         70.2   63.9   80.0   76.9   74.7   68.6      434.3
         85.9   79.1   84.3   91.4   84.8   68.1      493.6

Total   467.3  489.2  511.3  459.6  470.0  454.9     2852.3

Similarly,

$$\text{Replications SS} = \frac{1}{k^2}\sum (\text{replication total})^2 - C = \frac{1}{36}\left[(544.4)^2 + (530.0)^2 + (563.8)^2 + (616.9)^2 + (597.2)^2\right] - C = 45343.20 - 45197.86 = 145.34,$$

$$\text{Varieties ignoring blocks SS} = \frac{1}{R}\sum (\text{variety total})^2 - C = \frac{1}{5}\left[(73.5)^2 + \cdots + (68.1)^2\right] - C = 45667.87 - 45197.86 = 470.01.$$
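
As a check on the arithmetic above, the following sketch recomputes the correction term and the two sums of squares from the totals. The variable names are hypothetical. The variety totals are those of Table III, with two damaged entries read as 85.9 so that the row and column totals agree; that reading is an assumption about the scan, not a statement by the author.

```python
R, k = 5, 6                      # replications and block size (k*k = 36 varieties)
grand_total = 2852.3
rep_totals = [544.4, 530.0, 563.8, 616.9, 597.2]   # X1, X2, X3, Y1, Y2

# Variety totals (Table III); the two 85.9 entries are an assumed reading.
variety_totals = [
    [73.5, 82.6, 81.9, 80.0, 80.9, 84.0],
    [77.4, 85.9, 98.8, 80.6, 67.4, 80.5],
    [77.6, 98.5, 85.2, 68.8, 82.4, 74.3],
    [82.7, 79.2, 81.1, 61.9, 79.8, 79.4],
    [70.2, 63.9, 80.0, 76.9, 74.7, 68.6],
    [85.9, 79.1, 84.3, 91.4, 84.8, 68.1],
]

# Correction term: (grand total)^2 divided by the number of plots, R * k^2 = 180.
C = grand_total ** 2 / (R * k ** 2)

# Each replication total is a sum of k^2 plots; each variety total a sum of R plots.
rep_ss = sum(t ** 2 for t in rep_totals) / k ** 2 - C
var_ss = sum(t ** 2 for row in variety_totals for t in row) / R - C

print(f"C = {C:.2f}, replications SS = {rep_ss:.2f}, varieties SS = {var_ss:.2f}")
```

The varieties figure may differ from 470.01 in the last digit, because the text subtracts two quantities that were each rounded first.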


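The blocks A sum of squares is computed in two parts, one from the
X-replications (blocks A') and one from the Y-replications (blocks A'').

$$\text{Blocks A}'\ \text{SS} = \frac{1}{k}\sum (\text{X block total})^2 - \frac{1}{k^2}\sum (\text{X replication total})^2 - \frac{1}{R_1 k}\sum (\text{X row total})^2 + \frac{1}{R_1 k^2}(\text{X total})^2$$

$$= \frac{1}{6}\left[(94.8)^2 + \cdots + (98.2)^2\right] - \frac{1}{36}\left[(544.4)^2 + (530.0)^2 + (563.8)^2\right] - \frac{1}{18}\left[(278.6)^2 + \cdots + (287.1)^2\right] + \frac{1}{108}\left[1638.2\right]^2 = 13.77$$
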
X-REPLICATION BLOCK TOTALS

           X1      X2      X3     Total

          94.8    84.5    99.3    278.6
          89.0    92.2    93.0    274.2
          95.9    94.6    99.4    289.9
          86.3    83.9    91.4    261.6
          82.9    81.4    82.5    246.8
          95.5    93.4    98.2    287.1

Total    544.4   530.0   563.8   1638.2


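$$\text{Blocks A}''\ \text{SS} = \frac{1}{k}\sum (\text{Y block total})^2 - \frac{1}{k^2}\sum (\text{Y replication total})^2 - \frac{1}{R_2 k}\sum (\text{Y row total})^2 + \frac{1}{R_2 k^2}(\text{Y total})^2$$

$$= \frac{1}{6}\left[(109.1)^2 + \cdots + (98.5)^2\right] - \frac{1}{36}\left[(616.9)^2 + (597.2)^2\right] - \frac{1}{12}\left[(200.5)^2 + \cdots + (191.4)^2\right] + \frac{1}{72}\left[1214.1\right]^2 = 58.20$$
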
Y-REPLICATION BLOCK TOTALS

           Y1      Y2     Total

         109.1    91.4    200.5
         109.8    97.0    206.8
         108.8   111.5    220.3
         106.8    97.0    203.8
          89.5   101.8    191.3
          92.9    98.5    191.4

Total    616.9   597.2   1214.1


In the case of blocks A'' we have only two replications, which permits
a simplification of the computation:

$$\text{Blocks A}''\ \text{SS} = \frac{1}{2k}\sum (\text{difference of block totals})^2 - \frac{1}{2k^2}(\text{difference of replication totals})^2$$

$$= \frac{1}{12}\left[(17.7)^2 + (12.8)^2 + \cdots + (5.6)^2\right] - \frac{1}{72}(19.7)^2 = 58.20$$

Y-REPLICATION BLOCK TOTALS

           Y1      Y2     Diff.

         109.1    91.4    17.7
         109.8    97.0    12.8
         108.8   111.5    -2.7
         106.8    97.0     9.8
          89.5   101.8   -12.3
          92.9    98.5    -5.6

Total    616.9   597.2    19.7

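
The two parts of the blocks A sum of squares can be reproduced from the block totals alone. The sketch below is illustrative and its names are hypothetical. It applies the same four-term expression to each group of replications and then checks the two-replication shortcut based on differences.

```python
k = 6

# Block totals by replication (one inner list per replication).
x_blocks = [[94.8, 89.0, 95.9, 86.3, 82.9, 95.5],
            [84.5, 92.2, 94.6, 83.9, 81.4, 93.4],
            [99.3, 93.0, 99.4, 91.4, 82.5, 98.2]]
y_blocks = [[109.1, 109.8, 108.8, 106.8, 89.5, 92.9],
            [91.4, 97.0, 111.5, 97.0, 101.8, 98.5]]

def blocks_a_ss(reps, k):
    """Blocks-within-replications SS after removing the part shared by
    corresponding blocks across the replications of one group."""
    n = len(reps)
    block_term = sum(b ** 2 for rep in reps for b in rep) / k
    rep_term = sum(sum(rep) ** 2 for rep in reps) / k ** 2
    row_totals = [sum(col) for col in zip(*reps)]   # same block over all reps
    row_term = sum(t ** 2 for t in row_totals) / (n * k)
    total_term = sum(row_totals) ** 2 / (n * k ** 2)
    return block_term - rep_term - row_term + total_term

a_x = blocks_a_ss(x_blocks, k)   # blocks A'
a_y = blocks_a_ss(y_blocks, k)   # blocks A''

# Shortcut for two replications: work with the differences of block totals.
diffs = [y1 - y2 for y1, y2 in zip(*y_blocks)]
a_y_short = sum(d ** 2 for d in diffs) / (2 * k) - sum(diffs) ** 2 / (2 * k ** 2)

print(f"A' = {a_x:.2f}, A'' = {a_y:.2f}, shortcut = {a_y_short:.2f}")
```

The general expression and the shortcut should agree for the Y group, which is a useful check that the differences were formed with consistent signs.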


The blocks B sum of squares is also computed in two parts. It will
be noted that the quantities to be squared are proportional to the
unadjusted correction terms for blocks, $C_{Xi}$ and $C_{Yj}$. In this computation
the terms in the difference between the X and Y group means may be ignored,
as the calculation corrects for means automatically. Thus in place of
$C_{Xi}$ and $C_{Yj}$ it is more convenient to use $C_{Xi}'$ and $C_{Yj}'$.

Thus
$$RR_2kC_{Xi}' = R_1(\text{ith column total for group Y}) - R_2(\text{ith row total for group X})$$
and similarly,
$$RR_1kC_{Yj}' = R_2(\text{jth column total for group X}) - R_1(\text{jth row total for group Y}).$$


. 
For example:
$$RR_2kC_{X1}' = 3(204.3) - 2(278.6) = 55.7.$$
In this manner, we obtain:

   $RR_2kC_{Xi}'$    $RR_1kC_{Yj}'$

        55.7            -67.9
       100.8            -55.6
        10.9            -78.9
        84.3            -99.8
        68.9            -16.5
        45.3            -47.2

       365.9           -365.9

As a computational check, it may be noted that the sum of the
$RR_2kC_{Xi}'$ is just $R_1$(sum of all Y plots) $- R_2$(sum of all X plots),
here 3(1214.1) - 2(1638.2) = 365.9. The sum of the $RR_1kC_{Yj}'$ is the
negative of this.
We may now write the sum of squares for blocks B.


$$\text{Blocks B}'\ \text{SS} = \frac{1}{RR_1R_2k}\sum_i (RR_2kC_{Xi}')^2 - \frac{1}{RR_1R_2k^2}\Big(\sum_i RR_2kC_{Xi}'\Big)^2$$

$$= \frac{1}{5 \times 3 \times 2 \times 6}\left[(55.7)^2 + (100.8)^2 + \cdots + (45.3)^2\right] - \frac{1}{5 \times 3 \times 2 \times 6^2}(365.9)^2 = 27.63.$$

$$\text{Blocks B}''\ \text{SS} = \frac{1}{RR_1R_2k}\sum_j (RR_1kC_{Yj}')^2 - \frac{1}{RR_1R_2k^2}\Big(\sum_j RR_1kC_{Yj}'\Big)^2$$

$$= \frac{1}{5 \times 3 \times 2 \times 6}\left[(67.9)^2 + (55.6)^2 + \cdots + (47.2)^2\right] - \frac{1}{5 \times 3 \times 2 \times 6^2}(365.9)^2 = 22.63.$$
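
The blocks B computation can be traced step by step from the marginal totals of Table II. The sketch below is illustrative only, with hypothetical names. It takes the first X column total as 266.8, the value that makes the column totals sum to 1638.2; this is an assumed reading of a damaged entry. The divisors R·R1·R2·k and R·R1·R2·k² are those that reproduce the figures in the text.

```python
R1, R2, k = 3, 2, 6          # X-replications, Y-replications, block size
R = R1 + R2

# Marginal totals of the combined X and Y tables (Table II).
x_row = [278.6, 274.2, 289.9, 261.6, 246.8, 287.1]
x_col = [266.8, 282.4, 291.0, 255.8, 278.7, 263.5]   # 266.8 is an assumed reading
y_row = [200.5, 206.8, 220.3, 203.8, 191.3, 191.4]
y_col = [204.3, 216.4, 196.9, 202.5, 187.5, 206.5]

# Multiples of the unadjusted correction terms.
cx = [R1 * yc - R2 * xr for yc, xr in zip(y_col, x_row)]   # R*R2*k*C'_Xi
cy = [R2 * xc - R1 * yr for xc, yr in zip(x_col, y_row)]   # R*R1*k*C'_Yj

def blocks_b_ss(terms):
    """Sum of squares of the correction multiples about their mean."""
    n = R * R1 * R2 * k
    return sum(t ** 2 for t in terms) / n - sum(terms) ** 2 / (n * k)

b_x = blocks_b_ss(cx)   # blocks B'
b_y = blocks_b_ss(cy)   # blocks B''

print([round(c, 1) for c in cx], round(sum(cx), 1))
print([round(c, 1) for c in cy], round(sum(cy), 1))
print(f"B' = {b_x:.2f}, B'' = {b_y:.2f}")
```

The two lists of multiples sum to equal and opposite values, which is the computational check described in the text.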


The residual sum of squares may now be obtained by difference: 


Residual SS = 1060.41 - (145.34 + 470.01 + 13.77 + 58.20 + 27.63 + 22.63)
            = 322.83.


The above calculations give the analysis of variance table as follows. 


                              Degrees of    Sum of      Mean
Source of Variation            Freedom      Squares     Square

Replications                       4         145.34     36.335
Varieties ignoring blocks         35         470.01     13.429
Blocks A                          15          71.97      4.798
Blocks B                          10          50.26      5.026
Residual                         115         322.83      2.807

Total                            179        1060.41


B. Tests of Significance 


A simple approximate F-test for varietal effects is made by combining
the blocks and residual sums of squares, giving an error term
of

71.97 + 50.26 + 322.83 = 445.06 on 140 d.f.

or a mean square of 3.179. This is to be compared with varieties ignoring
blocks, giving

$$F = \frac{13.429}{3.179} = 4.224 \text{ on 35 and 140 d.f.},$$

which is highly significant. The test is not exact because the error term
is a mixture of mean squares. The error in significance level caused by
using this approximation has been investigated by Cochran [1]. In
a case such as this the error will be quite small.

If an exact F-test is required, we must perform a little extra computation.
We require the sum of squares for varieties eliminating blocks,
which is most easily found from the identity

SS (varieties eliminating blocks) + SS (blocks ignoring varieties)
    = SS (varieties ignoring blocks) + SS (blocks eliminating varieties).

Now SS (blocks ignoring varieties)

$$= \frac{1}{k}\sum (\text{block total})^2 - \frac{1}{k^2}\sum (\text{replication total})^2$$

$$= \frac{1}{6}\left[(94.8)^2 + \cdots + (98.2)^2\right] - \frac{1}{36}\left[(544.4)^2 + (530.0)^2 + (563.8)^2\right] + \frac{1}{6}\left[(109.1)^2 + \cdots + (98.5)^2\right] - \frac{1}{36}\left[(616.9)^2 + (597.2)^2\right] = 195.19.$$

Hence SS (varieties eliminating blocks)
= 470.01 + 122.23 - 195.19 = 397.05 on 35 d.f.,
and the corresponding mean square is 11.344. This is to be tested
against the residual mean square, giving

$$F = \frac{11.344}{2.807} = 4.041 \text{ on 35 and 115 d.f.},$$

which is also highly significant.
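
Both F ratios follow mechanically from the sums of squares already obtained. The sketch below, with hypothetical variable names, rebuilds the residual line by difference and then forms the approximate and the exact ratio. It does not compute significance levels, which would need an F distribution table or a statistics library.

```python
total_df, total_ss = 179, 1060.41
table = {                                  # source: (d.f., sum of squares)
    "replications": (4, 145.34),
    "varieties ignoring blocks": (35, 470.01),
    "blocks A": (15, 13.77 + 58.20),
    "blocks B": (10, 27.63 + 22.63),
}

# Residual by difference.
resid_df = total_df - sum(df for df, _ in table.values())
resid_ss = total_ss - sum(ss for _, ss in table.values())
resid_ms = resid_ss / resid_df

var_df, var_ss = table["varieties ignoring blocks"]
blocks_df = table["blocks A"][0] + table["blocks B"][0]
blocks_elim_ss = table["blocks A"][1] + table["blocks B"][1]

# Approximate test: pool blocks and residual into one error term.
pooled_ms = (blocks_elim_ss + resid_ss) / (blocks_df + resid_df)
f_approx = (var_ss / var_df) / pooled_ms

# Exact test: varieties eliminating blocks, from the identity in the text.
blocks_ignoring_ss = 195.19
var_elim_ss = var_ss + blocks_elim_ss - blocks_ignoring_ss
f_exact = (var_elim_ss / var_df) / resid_ms

print(f"residual: {resid_ss:.2f} on {resid_df} d.f., MS {resid_ms:.3f}")
print(f"approximate F = {f_approx:.3f}, exact F = {f_exact:.3f}")
```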


C. Estimation of Varietal Means

If the intra-block analysis is to be used, the estimation of varietal
means is now trivial. We need merely adjust the raw varietal means by
adding the quantities $C_{Xi}$ and $C_{Yj}$, i.e.

$$v_{ij}' = \bar{x}_{ij\cdot} + C_{Xi}' + C_{Yj}'.$$

The use of $C_{Xi}'$ and $C_{Yj}'$ in place of $C_{Xi}$ and $C_{Yj}$ introduces a constant
bias which does not affect the comparisons and is generally quite
small. The bias is readily removed by adding the quantity
$-\frac{1}{k}(C_{X\cdot}' + C_{Y\cdot}')$, which in our case is equal to -0.34.

To calculate the inter-block estimates of the varietal means, we
must first estimate the coefficients $\lambda^*$ and $\nu^*$. The formulas for these
estimates appear on p. 796 and reduce in this example to

$$\hat{\lambda}^* = \frac{b - e}{b + e}, \qquad \hat{\nu}^* = \frac{b - e}{b + \frac{1}{3}e},$$

where b = pooled mean square for blocks eliminating varieties and e =
residual mean square. In this example

$$b = \frac{71.97 + 50.26}{25} = 4.889$$

and e = 2.807, so that

$$\hat{\lambda}^* = 0.2705 \quad \text{and} \quad \hat{\nu}^* = 0.3574.$$


Multiples of the unadjusted correction terms have already been
calculated, namely, $RR_2kC_{Xi}'$ and $RR_1kC_{Yj}'$. The adjusted correction
terms are found by multiplying these quantities by

$$\frac{\hat{\lambda}^*}{RR_2k} = \frac{0.2705}{60} \quad \text{and} \quad \frac{\hat{\nu}^*}{RR_1k} = \frac{0.3574}{90}$$

respectively. (Replace $\hat{\lambda}^*$ and $\hat{\nu}^*$ by one for the intra-block estimates.)
These corrections are appended to the table of varietal means (Table
IV) and the appropriate terms added to each raw varietal mean, yielding
the corrected means (Table V).
The additional adjustment due to using $C_{Xi}'$ and $C_{Yj}'$ in place of
$C_{Xi}$ and $C_{Yj}$ is

$$-\frac{1}{k}\left(\hat{\lambda}^* C_{X\cdot}' + \hat{\nu}^* C_{Y\cdot}'\right) = -0.03,$$

which may be added to each varietal mean, if desired.

It may be of interest to point out that the need for these corrections
is peculiar to the unequal sets designs. The corrections are identically
zero in an equal sets design.
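
The chain from mean squares to adjusted corrections is short enough to sketch in full. The weight formulas used below, (b - e)/(b + e) and (b - e)/(b + e/3), are inferred from the printed values 0.2705 and 0.3574 rather than copied from p. 796, so they should be read as an assumption that holds for this (3, 2) example. The names are hypothetical.

```python
R, R1, R2, k = 5, 3, 2, 6

b = (71.97 + 50.26) / 25     # pooled blocks (eliminating varieties) mean square
e = 2.807                    # residual mean square

# Weights as they reduce in this example (assumed form, see lead-in).
lam = (b - e) / (b + e)
nu = (b - e) / (b + e / 3)

# Multiples of the unadjusted corrections, as tabulated in the text.
cx_mult = [55.7, 100.8, 10.9, 84.3, 68.9, 45.3]         # R*R2*k*C'_Xi
cy_mult = [-67.9, -55.6, -78.9, -99.8, -16.5, -47.2]    # R*R1*k*C'_Yj

# Adjusted (inter-block) corrections appended to the table of means.
adj_x = [lam * c / (R * R2 * k) for c in cx_mult]
adj_y = [nu * c / (R * R1 * k) for c in cy_mult]

# Constant terms that remove the bias from using the primed corrections.
const_intra = -(sum(cx_mult) / (R * R2 * k) + sum(cy_mult) / (R * R1 * k)) / k
const_inter = -(sum(adj_x) + sum(adj_y)) / k

print(f"b = {b:.3f}, lambda* = {lam:.4f}, nu* = {nu:.4f}")
print([round(a, 3) for a in adj_x])
print([round(a, 3) for a in adj_y])
print(f"constant: intra {const_intra:.2f}, inter {const_inter:.2f}")
```

Carrying b to full precision rather than three decimals can move the last digit of the weights, so small differences from the printed values are to be expected.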

TABLE IV
VARIETY MEANS (UNADJUSTED)

                                                                      $\hat{\lambda}^* C_{Xi}'$

                     14.70    16.52    16.38    16.00    16.18    16.80     0.251
                     15.48    17.15    19.76    16.12    13.48    16.10     0.454
                     15.52    19.70    17.02    13.76    16.48    14.86     0.049
                     16.54    15.84    16.22    12.38    15.96    15.88     0.380
                     14.04    12.78    16.00    15.38    14.94    13.72     0.311
                     17.18    15.82    16.86    18.28    16.96    13.62     0.204

$\hat{\nu}^* C_{Yj}'$   -0.270   -0.221   -0.313   -0.396   -0.066   -0.187


TABLE V
VARIETY MEANS (ADJUSTED)

14.681   16.550   16.318   15.855   16.365   16.864
15.664   17.383   19.901   16.178   13.868   16.367
15.299   19.528   16.756   13.413   16.463   14.722
16.650   15.999   16.287   12.364   16.274   16.073
14.081   12.870   15.998   15.295   15.185   13.844
17.114   15.803   16.751   18.088   17.098   13.637

D. Variances and Standard Errors of Varietal Differences 


The variances of varietal differences depend on whether or not the
varieties being compared appear together in some block. We give first
the variances and standard errors for the intra-block analysis, using
the formulas on p. 795 with $\lambda^*$ and $\nu^*$ replaced by one.

Intra-block Variances and Standard Errors

a) Varieties in same X Block
$$V_1' = \operatorname{Var}\{v_{ij}' - v_{il}'\} = \frac{2 \times 2.807}{5 \times 6}\left[6 + \frac{2}{3}\right] = 1.248, \qquad \text{S.E.} = \sqrt{1.248} = 1.117$$

b) Varieties in same Y Block
$$V_2' = \operatorname{Var}\{v_{ij}' - v_{lj}'\} = \frac{2 \times 2.807}{5 \times 6}\left[6 + \frac{3}{2}\right] = 1.404, \qquad \text{S.E.} = \sqrt{1.404} = 1.185$$

c) Varieties not in same Block
$$V_3' = \operatorname{Var}\{v_{ij}' - v_{lm}'\} = \frac{2 \times 2.807}{5 \times 6}\left[6 + \frac{2}{3} + \frac{3}{2}\right] = 1.528, \qquad \text{S.E.} = \sqrt{1.528} = 1.236$$

d) Average of all comparisons
$$\bar{V}' = \text{Av. Var} = \frac{2 \times 2.807}{5 \times 7}\left[7 + \frac{2}{3} + \frac{3}{2}\right] = 1.470, \qquad \text{Av. S.E.} = \sqrt{1.470} = 1.212.$$

For most purposes the average standard error, 1.212, would be adequate.

The variances appropriate to the inter-block analysis are also derived
from the formulas on p. 795, using the estimates $\hat{\lambda}^*$ and $\hat{\nu}^*$ of $\lambda^*$
and $\nu^*$.

Inter-block Variances and Standard Errors

a) Varieties in same X Block
$$V_1'' = \operatorname{Var}\{v_{ij}'' - v_{il}''\} = \frac{2 \times 2.807}{5 \times 6}\left[6 + \frac{2}{3}(0.3574)\right] = 1.167, \qquad \text{S.E.} = \sqrt{1.167} = 1.080$$

b) Varieties in same Y Block
$$V_2'' = \operatorname{Var}\{v_{ij}'' - v_{lj}''\} = \frac{2 \times 2.807}{5 \times 6}\left[6 + \frac{3}{2}(0.2705)\right] = 1.199, \qquad \text{S.E.} = \sqrt{1.199} = 1.095$$

c) Varieties not in same Block
$$V_3'' = \operatorname{Var}\{v_{ij}'' - v_{lm}''\} = \frac{2 \times 2.807}{5 \times 6}\left[6 + \frac{2}{3}(0.3574) + \frac{3}{2}(0.2705)\right] = 1.243, \qquad \text{S.E.} = \sqrt{1.243} = 1.115$$

d) Average of all comparisons
$$\bar{V}'' = \text{Av. Var} = \frac{2 \times 2.807}{5 \times 7}\left[7 + \frac{2}{3}(0.3574) + \frac{3}{2}(0.2705)\right] = 1.226, \qquad \text{Av. S.E.} = \sqrt{1.226} = 1.107.$$

Again, for most purposes, the average standard error would be adequate.
We see that the estimated average variance in the inter-block
analysis is 17 per cent less than the estimated average variance in the
intra-block analysis.
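
All eight variances above share one pattern, which the following sketch makes explicit. The general form in terms of k, R, R1 and R2 is an inference from the numbers in this example (where k = 6, R = 5, R1 = 3, R2 = 2) and from the known equal-sets results; it is not quoted from p. 795. The function name is hypothetical.

```python
from math import sqrt

def lattice_variances(e, R, R1, R2, k, lam=1.0, nu=1.0):
    """Variances of varietal differences; lam = nu = 1 gives the intra-block case.

    Assumed pattern: the X-block comparison picks up (R2/R1)*nu, the Y-block
    comparison (R1/R2)*lam, and varieties sharing no block pick up both.
    """
    base = 2 * e / (R * k)
    x_term = (R2 / R1) * nu
    y_term = (R1 / R2) * lam
    return {
        "same X block": base * (k + x_term),
        "same Y block": base * (k + y_term),
        "not in same block": base * (k + x_term + y_term),
        "average": 2 * e / (R * (k + 1)) * (k + 1 + x_term + y_term),
    }

e, R, R1, R2, k = 2.807, 5, 3, 2, 6
intra = lattice_variances(e, R, R1, R2, k)
inter = lattice_variances(e, R, R1, R2, k, lam=0.2705, nu=0.3574)

for label in intra:
    print(f"{label:18s} intra {intra[label]:.3f} (S.E. {sqrt(intra[label]):.3f})"
          f"  inter {inter[label]:.3f} (S.E. {sqrt(inter[label]):.3f})")

reduction = 1 - inter["average"] / intra["average"]
print(f"average variance reduced by {100 * reduction:.0f} per cent")
```

Each variety shares an X block with k - 1 others and a Y block with k - 1 others, so the average is a weighted mean of the three cases with weights 5, 5 and 25 when k = 6.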

Due to the fact that $\hat{\lambda}^*$ and $\hat{\nu}^*$ are estimates rather than true values
the above variance estimates tend to be a little too small, i.e. they have
a negative bias. An approximate adjustment made to remove this bias
changes the estimated standard errors in this case by less than one per
cent [9].


E. Efficiency Relative to Alternative Designs 


. 
The formulas used to gauge relative efficiency are given on p. 798.
In using them we will estimate $\sigma_e^2$ by the residual mean square, 2.807,
and $k\sigma_b^2$ by the quantity (b - e)R/(R - 1) = (4.889 - 2.807)5/4 = 2.603.
The estimated efficiency of this particular experiment using intra-block
analysis relative to randomized blocks is

$$\frac{1.272}{1.470} = 0.865 \text{ or } 86.5\%,$$

a loss of about 13 per cent. If the inter-block analysis is used, the relative
efficiency becomes

$$\frac{1.272}{1.226} = 1.038 \text{ or } 103.8\%,$$

a gain of not quite 4 per cent.
The efficiency relative to an equal sets design is very close to 100 per
cent by either method of analysis. Using the intra-block analysis the
efficiency is

$$\frac{1.444}{1.470} = 0.982 \text{ or } 98.2\%.$$

Using the inter-block analysis we calculate an efficiency of 99.86 per
cent.

The quintuple lattice does not exist for a 6 × 6 design, so there is no
point in making that comparison for a lattice experiment with 36
varieties.
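
The efficiency figures in this section can be retraced as follows. This is a sketch under two stated assumptions that are not taken from p. 798: the randomized blocks error variance is estimated as $\sigma_e^2 + k\sigma_b^2/(k+1)$, which reproduces the value 1.272, and the average variances have the form $\frac{2e}{R(k+1)}[k + 1 + (R_2/R_1)\nu + (R_1/R_2)\lambda]$. The names are hypothetical.

```python
e, b = 2.807, 4.889              # residual and pooled blocks mean squares
R, R1, R2, k = 5, 3, 2, 6
lam, nu = 0.2705, 0.3574         # estimated weights from the text

k_sigma_b2 = (b - e) * R / (R - 1)     # estimate of k * sigma_b^2

# Assumed randomized blocks error variance and variance of a difference.
rb_error = e + k_sigma_b2 / (k + 1)
rb_var = 2 * rb_error / R

def average_variance(r1, r2, lam=1.0, nu=1.0):
    """Average variance of a varietal difference (assumed form, see lead-in)."""
    r = r1 + r2
    return 2 * e / (r * (k + 1)) * (k + 1 + (r2 / r1) * nu + (r1 / r2) * lam)

intra_avg = average_variance(R1, R2)
inter_avg = average_variance(R1, R2, lam, nu)

eff_intra_vs_rb = rb_var / intra_avg
eff_inter_vs_rb = rb_var / inter_avg

# Equal sets design with the same total number of replications, intra-block.
equal_sets_avg = 2 * e / (R * (k + 1)) * (k + 3)
eff_intra_vs_equal = equal_sets_avg / intra_avg

print(f"k*sigma_b^2 = {k_sigma_b2:.3f}, RB variance = {rb_var:.3f}")
print(f"intra vs RB {eff_intra_vs_rb:.3f}, inter vs RB {eff_inter_vs_rb:.3f}")
print(f"intra vs equal sets {eff_intra_vs_equal:.3f}")
```

The inter-block comparison with an equal sets design is left out, since it needs weights for the equal sets case that the text does not give.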


F. Conclusions on the Numerical Example 


If the above findings were made relative to an actual experiment 
instead of our synthetic one, one might draw the following conclusions. 

a) The lattice design did not appreciably improve the accuracy of 
the experiment relative to what might have been expected from a 
randomized blocks design. — 

b) The use of the inter-block analysis has saved the experiment from
a considerable loss (13 per cent) relative to a randomized blocks design.

c) The loss of efficiency due to using unequal sets of replications
was negligible.


[Note: In the two original experiments from which our data were taken, 
the apparent gains relative to randomized blocks (using inter-block 
analysis) were 11 per cent in the first experiment (from which we took 
four replications) and 3 per cent in the second (from which we took one 
replication),] 


ACKNOWLEDGEMENT 


The writer would like to express his thanks to Professor John W.
Tukey for suggesting this problem.


BIBLIOGRAPHY 


[1] Cochran, W. G., “Problems arising in the analysis of a series of similar
experiments,” Journal of the Royal Statistical Society, Supplement 4 (1937).

[2] Cochran, W. G., and Cox, G. M., Experimental Designs, New York, John
Wiley and Sons, 1950.

[3] Cox, G. M., Eckhardt, R. C., and Cochran, W. G., “The analysis of lattice
and triple lattice experiments in corn varietal tests,” Iowa Agricultural
Experiment Station Research Bulletin 281, 1940.

[4] Cramer, H., Mathematical Methods of Statistics, Princeton, Princeton
University Press (1946), p. 116.

[5] David, F. N., and Johnson, N. L., “The effect of non-normality on the power
function of the F-test in the analysis of variance,” Biometrika, 38 (1951),
43-57.

[6] Grundy, P. M., “The estimation of error in rectangular lattices,”
Biometrics, 6 (1950), 25-33.

[7] Harshbarger, B., “Rectangular lattices,” Virginia Agricultural Experiment
Station, Memoir 1, 1947.

[8] Kempthorne, O., and Federer, W. T., “The general theory of prime power
lattice designs II,” Biometrics, 4 (1948), 109-21.

[9] Meier, P., “Weighted means and lattice designs,” Doctoral thesis, Princeton
University, 1951.

[10] Nair, K. R., “Analysis of partially balanced incomplete block designs
illustrated on the simple square and rectangular lattices,” Biometrics, 8 (1952),
122-55.

[11] Yates, F., “A new method of arranging variety trials involving a large
number of varieties,” Journal of Agricultural Science, 26 (1936), 424-55.

[12] Yates, F., “The recovery of inter-block information in variety trials
arranged in three-dimensional lattices,” Annals of Eugenics, 9 (1939), 136-56.




ON THE PRESENTATION OF THE RESULTS OF SAMPLE 
SURVEYS AS LEGAL EVIDENCE* 


W. Edwards Deming
New York University 


PURPOSE OF THIS PAPER 


The purpose here is to view some of the problems that confront the
statistician when he presents the results of a sample survey as legal
evidence. One particular point is that the statistician, if he is to make his 
work useful, must distinguish between (a) what he as a statistician may 
say about the precision of the results of his survey, and (b) what an expert 
in the substantive field may conclude about the usefulness of the results. 
The statistician can testify only to the former, and possibly also about 
the variance between investigators, and between different methods, if he 
measured these differences. As a secondary purpose, we shall enquire into 
the meaning of a standard error, and its relation to a complete count and 
to the usefulness of the results—a point that is often overlooked, not 
only in testimony but in statistical reports. 

I have no magic nor all the answers to all the questions and difficulties
that the statistician will encounter when he presents results as evidence.
It is possible, however, to share some experiences with colleagues in this
increasingly important role of statistical surveys; to acquaint them with
some of the kinds of problems that may arise; and to suggest some general
principles that will help the statistician to make his work more useful
than it would be otherwise.

At the outset I may explain that this paper will deal only with probability
samples. The defence of any other kind of sample is hardly a
problem for a statistician anyhow, but rather for the substantive expert
who may have enough knowledge of the material and of its variability
to feel that he can testify one way or another with respect to the
interpretation of the results of a judgment sample.

A statistical survey, formulated and carried out by the dictates of the
theory of probability, is to the statistician an exciting and remarkable
achievement. It produces man’s best empirical knowledge, and it provides
an objective measure of the amount of knowledge in the survey. The precision
desired can be aimed at and hit pretty accurately by planning
in advance with the aid of the theory of probability and with bits of knowledge
with respect to certain proportions, means, correlations, variances,
and other statistical measures of the sampling units in the frame. Then,
after the survey is completed, the precision that was actually reached is
calculable and expressible in an international standard of measure (the
standard error) from the results of the survey itself. This final measure
of the precision is objective, and is not a matter of opinion. It is not biased
by incorrect assumptions that went into the planning.

* Presented at a meeting of the American Statistical Association in Chicago, 29 December 1952.

The statistician when he presents his results as legal evidence finds
himself nevertheless at an uncomfortable disadvantage. He is usually
talking to scholars, but not to fellow statisticians, nor indeed to other 
scientists, nor relating the results of his research to a trusting client or 
sponsor. He is teaching, but the techniques of the class room will not 
necessarily be the best ones for the presentation of evidence. 

Scholars in other disciplines are not all acquainted with the achieve- 
ments of probability sampling, yet the statistician must somehow explain 
his methods to them. Some of the people that he must deal with in legal 
evidence know sampling only as a failure to predict an election; they 
know not the distinction between (a) the standard error of sampling, (b) 
the errors common to complete counts and to samples, and (c) the error 
of a prediction. Other people think of sampling as a selection by judg- 
ment, carried out by someone who has established a reputation by a run 
of successes in the past. To still others, sampling is a desperate risk, a 
hazardous aimless random drawing of areas or of other elements to which 
anything may happen, and concerning which nothing can really ever be
known except by comparison with a complete count, for which a sample is
only a substitute to save time and money.

In my own experience, a man questioned the existence of the theory
for estimating the variance of a mean, originated by Gauss 120 years
ago, and now used all over the world. I was once accused of “pyramiding”
my results (whatever that is), because I took the average of the averages
of my 10 subsamples for an estimate of the whole, wherefore my standard
error “must be viewed with some doubt.”


DIRECT TESTIMONY: CROSS-EXAMINATION 


There is first of all direct testimony, wherein the statistician presents
his results, after careful preparation in advance. He will usually simply
read his direct testimony into the record from typed copy. Direct testimony
may take the form of questions and answers, the questions being
read by the lawyer who has engaged the statistician. The questions should
be framed so that they display the results of the survey in the form of
valid statistical inferences. The questions must not sound as if they were
angling for particular answers, even though both sides in the case know
full well that the statistician is reading from prepared copy, and that the
answers are exactly what the statistician believes to be essential to his
methods and to his results, regardless of the questions.

In the preparation of testimony, the lawyer who engaged the statistician
will not try to influence the content of the statistician's statement.
He will try to help the statistician to state his procedures and his inferences
so that they will be clear. The inferences must be only what the
statistician can support, as a scientist seeking truth. To bring out the
truth in a scientific inference, one must not only state what he believes
to be true, but he must say it so that his listeners will understand what
he means, and not think that he has said something that he did not mean.
A good lawyer can help immeasurably in achieving this aim.

In giving evidence, the statistician is not fighting a case for either side.
He is an expert witness, and he should appear as a professional man
with the sole aim of presenting the truth. This means that he must tell to
the best of his ability what the figures mean. He should describe in full
any difficulties that he encountered, and their possible limitations on the
interpretations of his data.

In some courts one may not read prepared testimony, in which case
one can only prepare to present his testimony without the aid of his
copy. He will of course still be able to present tables and charts, called
exhibits.

Usually during or immediately following direct testimony the opposing
side asks only questions that will clear up simple failure to re[...]
technical terms, or to clarify some events with respect to their sequence
in time. Questions that may bring out flaws in the testimony, they will
usually reserve for further study, following which they will call the
statistician to the stand for cross-examination. Here the questions are often
well-prepared in advance, but the statistician must answer ex tempore.
Here the statistician may find himself very uncomfortable to find that [...]

When cross-examination comes, no matter what question comes, relevant
or irrelevant, do the best that you can with it. Be cautious to stay
within the field of competence that you have testified to in your qualifications
(vide infra). Groundwork in your direct testimony, in an attempt
to give clear explanations of your procedures, of the statistical interpretation
of your results and of their standard errors, will help to keep the
examination on the track and to bring out the inherent scientific truth
contained in your survey.


To present the results of a survey in a case where millions of dollars
are involved, to ears unfamiliar with the power of modern statistical
practice, is an experience that purifies the statistician's thinking.
Sometimes the listeners are glad to accept the results of a good survey, and
to learn something about modern survey-methods. At other times, they
will declare that the statistician’s methods are new and untried, that his
results are therefore not acceptable evidence; that his sample was too
small; or finally, forsooth, that he has not explained the entire theory
of sampling so that everyone can understand exactly what he did and
why, and that there is therefore no basis by which to judge whether his
results have any meaning.


ESSENTIAL INGREDIENTS OF THE DIRECT TESTIMONY


The statistician’s statement of his qualifications, which usually comes
in the first part of his direct testimony, is important. It is evidence by
which the examiner or judge may decide, if the question arises, whether
the statistician is qualified. It should therefore contain a full account of
the statistician’s education and relevant experience.

He may then present the purpose of the survey (an example of an assignment
will occur later), what he endeavored to do, the methods that
he prescribed, the basis for these methods, the system and the observations
by which he satisfied himself that the procedures that he prescribed
were understood and followed rigidly and faithfully; finally, the results
and their standard errors and their interpretation; also the possible effects
of any biases inherent in the procedure, and the possible effects of any
difficulties encountered. All these points will go into the direct testimony.

He should tell in simple words what the procedures actually were. He
should limit theory to a few simple and well-established principles that
illustrate the sampling procedures and the interpretation of the results.
The truth and the whole truth means clarity, so that anyone may judge
whether your results and your interpretation of the standard error are
what you say they are. You can not hope to give a whole course in the
theory of sampling, but you can make your procedures and their validity
clear without doing so. The most convincing argument concerning your
procedures and your interpretations is that they conform to established
international standards, and that they are used in a wide variety
of experience. In this connection the document written by the United
Nations Sub-Commission on Statistical Sampling entitled “The presentation
of sampling survey results” (UN Series C, No. 1, 1950) is of assistance;
likewise the “Manual on the Quality Control of Materials” (1951)
and other recommended and standard practices of the American Society
for Testing Materials, many of which have been adopted as standards in
other parts of the world.

A formula can cause trouble unless you explain pretty expertly how
you used it. If in direct testimony you say that you used a formula to
calculate in advance the size of sample required, when the fact is that
you made a rough mental calculation and tempered it with judgment,
or that you made the calculation years ago for similar work, and you
did not make a fresh detailed calculation for this job, or if you did make
one and then modified the answer to allow for some possible additional
variance not fully represented in the formula, or to allow for some possible
heavy additional cost of inspection or of interviewing because of probable
[...] comfortable. The trouble is that people not accustomed to formulas do
not understand how one uses theory.

If you say that a certain constant in your formula for the required
sample-size represents your advance estimate of the variability of the
material that you sampled, someone may accuse you of prejudging the
answer. The fact is, however, that this advance estimate does not invalidate
in the slightest the standard error calculated from the results, nor
cause any bias in the procedure. You must make this clear in your
testimony.

In practice sample-sizes are based on both theory and experience, even
though you do not make a fresh calculation for every sample-design.
Theory is part of your experience. Without theory, experience has no
meaning. Theory and experience together produce scientific advances.
All this can be made clear, I believe.

I proceed now to describe some of the other problems of exposition that
have arisen, and to offer some suggestions toward meeting them.


IMPLICIT FAITH IN THE COMPLETE COVERAGE, 
AND IN THE 10 PER CENT SAMPLE 


A complete coverage, no matter how carried out, and even though
incomplete (as complete counts too often are), has weight in evidence.
A sample, unless it is a 10 per cent sample, has two strikes against it to
start with. People who are not statisticians assume that the sheer size of
a complete coverage will somehow cover up its incompleteness and the
flaws in the method of measurement or in the interviewing. They assume
that a judgment sample, if it is big enough, will do the same; and that it
will in addition overcome biases of the unknown probabilities of selection.

A 10 per cent sample has almost equal standing with a complete count,
maybe even better than a 15 per cent sample. Why, or what 10 per
cent, is hardly ever questioned, even by experts in quantitative
subject-matter.

The statistician, in the explanation of his sampling procedure, faces 
such preconceived ideas. The precision of a small sample, selected and 
estimated by an efficient probability procedure, will require justifica- 
tion. It is a fact that the aerial plant in a sample of 1000 to 1500 tele- 
phone poles will provide all the precision that one can use for the estima- 
tion of the average over-all physical condition of the entire aerial plant 
which might be worth $200,000,000. But without very careful prepara- 
tion to dispel preconceived ideas about complete counts and 10 per cent 
samples, the statistician must be prepared to face an objection on the 
ground that a sample of only 1 part in 1000 is not admissible as evidence. 
The man who objects may, without knowing it, own stock in a woolen 
mill that purchases a million pounds of wool and pays duty on it on the 
basis of a sample that weighs from 60 to 100 ounces. 

The troubles that people have in understanding the power of a small 
sample are often tied up with failure to understand that it is the absolute 
size (n) of the sample, and not its proportion (n/N) to the whole, which 
determines the standard error of the result. The statistician must be
prepared to meet the man who thinks that to reach a prescribed precision
in an estimated average rent, for example, a sample of dwelling units
from a big city must be bigger than the sample from a small city, because 
the big city is bigger. 
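The point can be checked with a little arithmetic. The sketch below is an illustration only, not part of any testimony: it assumes simple random sampling without replacement and uses the usual finite-population formula for the standard error of a proportion. The function name and the figures chosen for n and N are hypothetical.

```python
import math


def se_proportion(p: float, n: int, N: int) -> float:
    """Standard error of a sample proportion for a simple random sample
    of n items drawn without replacement from a lot of N items."""
    fpc = (N - n) / (N - 1)  # finite-population correction
    return math.sqrt(p * (1 - p) / n * fpc)


if __name__ == "__main__":
    n = 1000  # hypothetical sample size, the same for every city
    for N in (10_000, 100_000, 10_000_000):  # hypothetical city sizes
        print(f"N = {N:>10,}   n = {n}   SE = {se_proportion(0.5, n, N):.5f}")
    # Quadrupling n roughly halves the standard error, whatever N may be.
    print(f"n = 4000, N = 10,000,000   SE = {se_proportion(0.5, 4000, 10_000_000):.5f}")
```

Under these assumptions the standard error hardly moves as N grows a thousandfold, while a change in n moves it at once.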

With careful preparation, you can dispel such misunderstandings in
an entertaining way, and in simple language. You can explain with black
and white beans the statistical principles used, and why it is that the
standard error of a sample is in practice hardly influenced at all by the
size of the lot that it was drawn from. You can portray vividly how a
pint jar of dried beans scooped up from a larger mixture of black and
white beans will provide an estimate of the proportion black in the
mixture; and that a sample of less than a pint would probably be sufficient.
You may then observe, and your listeners will agree, that the mixture
could as well be a carload of beans as a bushel of beans: the sample
provides as good an estimate of the proportion black for the carload as it
does for the bushel, provided that in both cases the mixture is thoroughly
mixed (an illustration borrowed from testimony presented by Professor
John W. Tukey). In practice we accomplish thorough mixing with the
use of a table of random numbers, a tool indispensable today in science.



The nigh total failure of the size of a lot to have any influence on the
standard error of a random sample drawn therefrom is illustrated by
charts in Eugene L. Grant's book, Statistical Quality Control (McGraw-
Hill, 1946), page 345. Incidentally, such citations will often help the
statistician's listeners to appreciate the fact that his methods are in
universal use. One may usefully refer to the ever-expanding dependence
of all kinds of scientific, industrial, agricultural, and medical research
on statistical theory; the use of statistical methods to attain extreme
precision in industrial production; the necessity for proper statistical
design in the comparison of two industrial processes, machines, or medical
treatments; the growing reliance, in many parts of the world, on
probability samples in social and economic studies that are to guide
important decisions.
If you succeed in making your explanation clear, you will help your
listeners to appreciate the contribution of modern statistical principles
and techniques to scientific truth. They may be grateful, in the long run.
Complex terms, flourished too freely, may alienate your listeners. Rely
on patience, truth, and simple language. You can not afford to lose the
attention of the examiner or judge; he is in position to protect truth and
accuracy of statement. In cross-examination keep him on your side by
your fairness and willingness to try to clear up any questions concerned
with your sample.


PRECISION, ACCURACY, AND STANDARD ERROR


Two concepts that are important to make clear in any presentation are 
precision and accuracy. Most statisticians probably think that they know 
what these words mean. I must confess that experience under the fire of 
cross-examination taught me some new angles to their meaning, and 
taught me the importance of explaining in advance the limitations of a
standard error.

Precision is expressible by an international standard, viz., the standard 
error. It measures the average of the differences between a complete 
coverage and a long series of estimates formed from samples drawn from 
this complete coverage by a particular procedure of drawing, and
processed by a particular estimating formula.
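The definition above can be tried out on a small scale. The following is a sketch under stated assumptions, not a prescribed procedure: the "complete coverage" is a hypothetical list of numbers, the procedure of drawing is simple random sampling without replacement, the estimating formula is the plain mean, and the "average of the differences" is taken as a root-mean-square. All names are invented for the illustration.

```python
import math
import random
import statistics


def simulate_standard_error(population, n, repetitions, seed=0):
    """Root-mean-square difference between the complete-coverage mean and
    the means of repeated simple random samples of size n."""
    rng = random.Random(seed)
    true_mean = statistics.fmean(population)
    squared = [
        (statistics.fmean(rng.sample(population, n)) - true_mean) ** 2
        for _ in range(repetitions)
    ]
    return math.sqrt(statistics.fmean(squared))


def theoretical_standard_error(population, n):
    """Textbook standard error of the mean for sampling without replacement."""
    N = len(population)
    sigma = statistics.pstdev(population)  # divisor N, not N - 1
    return sigma / math.sqrt(n) * math.sqrt((N - n) / (N - 1))


if __name__ == "__main__":
    frame = [float(i % 100) for i in range(1000)]  # hypothetical frame of 1000 items
    print("simulated  :", simulate_standard_error(frame, 50, 2000))
    print("theoretical:", theoretical_standard_error(frame, 50))
```

With a long enough series of repetitions the two figures should agree closely, which is all that the standard error claims to measure.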

Great precision or a small standard error attached to an estimate does
not mean that this estimate is necessarily highly accurate or useful. It
does mean that the results of a complete coverage would have been the
same within a very narrow margin of difference, had the complete
coverage been carried out with the same investigators, sharing the load
proportionately, and with the same care as they expended on the samples.


The so-called “expected value” of a sampling procedure (which of
course includes the formula for the estimate) is the same as the result
of an attempted complete coverage of the same frame that the samples
are to be drawn from (except for a possible bias in the formula, for
which an upper and innocuous limit will be known). Both the complete
coverage and the sample are subject to the same uncertainties and
errors, such as inadequate supervision, nonresponse, wrong information,
missing information, failure of workers to cover their whole assignments,
and to find all the people or all the items. The only difference is
that the sample has sampling error, which is the one error that we are
best able to govern and to measure. The statistician measures the
uncertainty introduced by sampling. The substantive expert judges
whether the same operations would give accurate and useful information
if applied to the entire frame.

The statistician will have drawn up the statistical procedures for the
survey (the design of the sample, the instructions for drawing it, the
instructions for tabulating the results and for computing the estimates
and their standard errors). During the progress of the work, he should
be on hand as often and as long as necessary to know that the company
that retained him is following his instructions meticulously. He is then
in a position to defend the validity of the standard error. If at any time
he is not satisfied with the performance of the workers, it is better for
him to terminate at once his relationship with the client. He should be
sure that this responsibility is clear beforehand.

A statistician will occasionally be called upon to give his opinion in 
regard to procedures that another statistician has drawn up and testified 
to, or to give his interpretation of the results, including the standard 
error. After he has a chance to examine the procedures, he may testify,
if he agrees, that they are one of many possible probability designs, and
that IF they were followed meticulously, the results and the standard
errors have certain interpretations, which he may give if called upon to
do so. He may require, before he testifies, that certain calculations be
carried out, to help him to examine the magnitudes of any biases that
he may suspect. He may require calculations of skewness, if he suspects
that the estimate of the standard error is not sufficiently firm. The results
of these investigations will guide his conclusions and his testimony
concerning the precision of the results of the survey. He must not be satisfied
to testify to what he knows; he must explain how certain aspects of the
survey that he had no opportunity to examine could possibly affect the
results.

Even with familiarity with the job, and no matter how satisfied the
statistician may be with the execution thereof, he can still not testify to
the inherent usefulness of the result. Unfortunately, he has no standard
error of the usefulness of a result. Testimony on the usefulness of the re- 
sults will be left to the substantive expert—the engineer, the chemist, the 
physician, the population expert, the agricultural expert. The usefulness
of a result is not a problem of sampling; it deals rather with the method
of measurement and with reasons why the method used will produce data
that will satisfy a particular need. The method would be the same whether 
the survey were a complete coverage or a sample. 

In cross-examination the opposition may tempt the statistician beyond
the sphere of his competence. The statistician must try to answer all
questions politely and simply, yet he must stay within the limitations of
his own ability and of the standard error. He certainly has a right to say 
he does not know the answer to a question that is beyond his compe- 
tence and beyond his direct knowledge. 

Although he can not testify to the inherent usefulness of the result, 
the statistician can certainly make it clear that he would not have associ- 
ated himself with the study had he not been sure in advance that it would 
be executed rigidly in conformance with his specifications, and that the 
methods of inspection, interviewing, and questioning, although beyond
his qualifications, would be satisfactory and produce useful data. He may
do this without professing to be an expert in the subject-matter, as he
may declare that he has confidence in Mr. So and So (expert in the
subject-matter), who has testified, or will, concerning these things.

This division of responsibility between the statistician and the expert
in the subject-matter should not be difficult to explain, but it is easy to
forget to do it; and still easier later on in cross-examination to be lured
across the border into the subject-matter and into trouble.

The following excerpt represents a statistician’s attempt in direct
testimony to state what his job was; to put a limitation on his assignment,
and hence on what he could testify to in cross-examination. The
case involved the use of samples of items of telephone plant, the aim
being to obtain a figure for the average over-all per cent physical
condition of the entire property.


1 The Illinois Commerce Commission, Docket No. 39126, 1951, and Docket No. 41606, 1954: The
Illinois Bell Telephone Company in the matter of the proposed advance in rates. The passage printed
here is testimony prepared in advance, and is not necessarily the same word for word in the record.

Q. Doctor, for what purpose were you engaged by the Illinois Bell
Telephone Company?

A. I was told that this company proposed to make a survey to determine the
physical condition of its plant. They asked me to prescribe statistical
procedures by which to select samples of items of plant for inspection,
such as poles, wire, cable, telephones, relays, central office equipment.
The samples must determine within narrow limits of precision what
result would be obtained for the average over-all per cent condition by a
complete 100% inspection of all the items in all the classes of plant that
were to be inspected, with the same inspectors, and with the same care
as was exercised on the samples, were such a thing possible.

This assignment carried with it the responsibility for prescribing
the procedures for summarizing the results for each class of plant, once
the code-values assigned by the inspectors were translated into
percentages, and for combining the per cent conditions of the several classes
of plant into the over-all average per cent conditions of all the classes 
that were to be inspected. A necessary part of the assignment was to 
provide procedures by which to calculate the standard error of the pre- 
cision of the result obtained for the over-all average per cent condition. 

My assignment did not include the responsibility for the procedures 
for inspecting any item, nor for the numerical values that translated the 
inspectors’ codes into percentages. Neither had I any responsibility for 
determining the weights of the various classes of property. These prob- 
lems are the same whether one uses sampling or not. These phases of the 
work have been described by Mr. Coxe (General Staff Engineer). 

Q. Does Company Exhibit No. 112 (Sampling Procedures for Drawing the 
Items of Property for Field Inspection) contain the procedures that you 
prescribed? 

A. Yes sir, it does. 


Later on came the following explanation of the standard error of the
result:

Q. What is your interpretation of the standard error of this study?
A. The sampling precision of this study is expressed by the over-all
standard error, which turned out to be .19 per cent. This standard error is not
a matter of opinion nor of expert judgment, but is objective, as it is
calculated by the laws of probability from the results themselves. The
interpretation of this standard error is simple: I may say with a high
degree of assurance that the maximum uncertainty that one may attach
to the over-all per cent condition because of the introduction of sampling
can not be, at the outside, more than three times the standard error. In
other words, any uncertainty in the figure 74.5% (the final result) which
can be attributed to the fact that the company used samples instead of a
complete and total inspection of every item, with the same care as was
exercised on the samples, were such a thing possible, can not exceed .57
per cent.


THE PERMANENCE OF THE STANDARD ERROR CONTRASTED WITH THE
TEMPORAL CHARACTER OF ACCURACY AND USEFULNESS


In a probability sample (the only type of survey to be considered here)
the precision is calculable from the results, as I mentioned in the opening
paragraphs. In practice, the size of the sample will be sufficient to provide
a firm estimate of the standard error. This was the case in the excerpt
above. One may say with a high degree of assurance that in a long series
of repetitions of this sampling procedure, only about 2.3 per cent of the
results would fall more than 2 standard errors above the result of the
complete coverage, and about 2.3 per cent of the results would fall more
than 2 standard errors below. Practically none of the long series would
fall beyond 3 standard errors either way.
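The figures just quoted can be reproduced in a few lines. This is a sketch only: it assumes that the sampling distribution of the estimate is close to normal (an assumption of the illustration, consistent with the 2.3 per cent quoted), and it borrows the result 74.5 per cent and the standard error .19 per cent from the excerpt above. The function name is hypothetical.

```python
import math


def upper_tail(k: float) -> float:
    """Probability that a standard normal deviate exceeds k."""
    return 0.5 * math.erfc(k / math.sqrt(2))


if __name__ == "__main__":
    estimate, standard_error = 74.5, 0.19  # figures from the excerpt
    for k in (2, 3):
        print(f"beyond {k} standard errors on one side: {100 * upper_tail(k):.2f} per cent")
    margin = 3 * standard_error
    print(f"3 standard errors = {margin:.2f}, giving limits "
          f"{estimate - margin:.2f} to {estimate + margin:.2f} per cent")
```

The first line of output corresponds to the 2.3 per cent on each side; the last shows where the outside margin of .57 per cent in the testimony comes from.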

It is another thing, however, to say whether the complete coverage,
were it possible, would produce useful information. The inherent
accuracy of the method of measurement (the interviewing, the
questionnaire, the method of inspection), and the usefulness of the information,
whether obtained by a complete coverage or by a sample, is a matter for
the substantive expert to testify to, as explained earlier.

The main difference between a sample and a complete count is that 
the sample possesses an error of sampling. The statistician testifies in 
regard to this. A sufficient degree of precision is necessary for the useful- 
ness of sample results, but it does not guarantee their usefulness. 

The inherent accuracy and usefulness of the procedures of measure- 
ment will change from time to time as the substantive experts develop 
new concepts of the kind of information that they require to solve new 
and changing problems. Anyone who has followed the changing concepts 
of the characteristics of the labor force, or the changing concepts of a farm, 
or of family-budget studies, or the changing concepts of the desirable
characteristics of fibres and of textiles, will know that no definitions or 
methods of measurements stay fixed. 

In contrast, the standard error of a procedure of sampling remains
fixed with time; likewise the interpretation of the standard error. The
validity of the standard error does not depend on the economy or
cleverness of the design of the sample. It depends only on careful execution
and on the rigid use of probability methods in accordance with some
prescribed statistical plan. For this reason, the standard error of a sampling
procedure, and its interpretation, remain valid, even though new
advances in theory point the way to more economical sampling procedures
by which to obtain the same standard error.


REFERENCES 

There is apparently no previous literature that deals with the
presentation of modern statistical procedures and their results in legal evidence.
Fortunately, however, sampling has received attention from the legal
standpoint in a paper by Frank R. Kennedy, who supplied copious re-




clusion, it is a pleasure 
chiefly from Mr. Melvin F. 


ACCURACY OF AGE REPORTING IN THE 1950 
UNITED STATES CENSUS 


ROBERT J. MYERS
Social Security Administration 


One common human error is to round figures even though precise
results might be desired or requested. This is particularly evident
in census returns where the age in integral years is sought. It arises
particularly for ages ending with the digit 0 and, to a lesser extent,
for ages ending with digits 2, 5, and 8. This paper will investigate the extent
of preference for certain digits of age in the 1950 United States census
and will indicate the extent of improvement that has occurred since
earlier censuses, as well as giving certain summary data for several
other countries. The analysis will be carried out using the “blended”
method.


DESCRIPTION OF METHOD OF ANALYSIS 


One method for showing the degree of preference for certain digits 
of age consists of starting at a given age, say 20, and adding up the 
population for all ages ending in 0, all ending in 1, etc. Then the popu- 
lation at each digit is expressed as a percentage of the total population; 
any considerable deviation from 10 per cent would be taken as an
indication of bias in age reporting for that particular digit. This procedure,
however, does not yield truly valid results since it is not proper simply
to add the overall populations at each digit starting at a particular
age, because then the “leading” digits naturally occur more frequently
among the persons counted than the “following” ones.

The “blended” method overcomes this objection by allowing each
digit in turn to be the “initial” one. The ten separate results are then
summed, and a percentage distribution by digit is computed. The
justification for this method is largely empirical, based on general 
reasoning and logic, with the further point that it produces proper 
results for smooth, life table data (i.e., shows no digit preference). 

As an example of how the “blended” method operates, when the
count is started at age 20, the population considered at unit digit 0 is
the sum of those at ages 20, 30, 40, etc. For the nine cases when the
count is started successively at ages 21 to 29, the population considered
at digit 0 begins with that at age 30 instead of age 20. Correspondingly,
as to the population at digit 1, when the count is started in turn at
ages 20 and 21, included are ages 21, 31, 41, etc., while when the count
is started at any of ages 22 to 29, included are ages 31, 41, etc.²

1 See Robert J. Myers, “Errors and bias in the reporting of ages in census data,” Transactions of
the Actuarial Society of America, XLI. Reproduced in Handbook of Statistical Method for
Demographers, Bureau of the Census.
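The counting rule just described can be written out in a few lines. What follows is a literal sketch of the description in the text, not a reproduction of the published computations: each of ten consecutive ages serves in turn as the starting age, the population from that age through a fixed end age is tallied by unit digit, and the ten tallies are summed. The function name, the default ages, and the illustrative population are assumptions made here.

```python
from typing import Mapping


def blended_distribution(pop: Mapping[int, float],
                         first_start: int = 20,
                         end_age: int = 99) -> list[float]:
    """Percentage of the blended count falling at each unit digit 0..9."""
    totals = [0.0] * 10
    # Let each of ten consecutive ages act in turn as the starting age.
    for start in range(first_start, first_start + 10):
        for age in range(start, end_age + 1):
            totals[age % 10] += pop.get(age, 0.0)
    grand_total = sum(totals)
    return [100.0 * t / grand_total for t in totals]


if __name__ == "__main__":
    # Hypothetical smooth population declining 2 per cent per year of age,
    # with an artificial 30 per cent excess at every age ending in 0.
    pop = {age: 1000.0 * 0.98 ** (age - 20) for age in range(20, 100)}
    for age in range(20, 100, 10):
        pop[age] *= 1.3
    for digit, pct in enumerate(blended_distribution(pop)):
        print(digit, round(pct, 1))
```

One caution about this literal reading: with a fixed end age the ten tallies do not cover equal numbers of ages, so a perfectly flat population does not come out at exactly 10 per cent per digit. The text claims results near 10 per cent only for smooth life table data, in which numbers fall off with age.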

The result of these calculations then is a percentage distribution of
the population at each of the 10 digits. If no heaping were present,
each figure would be very close to 10 per cent. Conversely, any sizable
deviation from 10 per cent indicates the presence of such inaccuracy.
A relative index of the amount of preference of age for any census
distribution can be obtained by summing up the absolute deviations from
10 per cent in each case.

Bachi² has suggested a somewhat preferable index, which amounts
to half the previously described index³ and which will hereafter be used
as the index of heaping. This index has a certain significance since, as
Bachi says, it “estimates the proportion of persons in the population
who return their ages with an inaccurate unit digit and thus has the
advantage of being more easily understood.” Bachi goes on to take the
extreme case where all people report the same unit digit, in which case
the index would be 90 per cent, indicating that 90 per cent of the people
returned inaccurate unit digits.

In actuality, Bachi's index more properly indicates the minimum
proportion of persons returning their ages with an inaccurate unit
digit since certain errors may be self-cancelling. Thus, taking the
common case where digit 0 is over-reported, there may be some persons
truly having an age ending in 0 who report some other age; these
persons are, of course, far more than offset by those who inaccurately
report themselves at an age ending in 0. In the extreme case, of course,
there might be 10 per cent of the persons reported at each of the 10
digits of age, which would yield an index of 0; yet it is theoretically
conceivable that every person has returned his age with an inaccurate
unit digit, but by chance there has been complete offsetting. At any
rate, however, the use of the index as developed by Bachi seems
preferable because it does have a certain real meaning.
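The index itself is a one-line computation: half the sum of the absolute deviations from 10 per cent. The sketch below applies it to the foreign-born white women column of Table 2 and to the extreme case mentioned in the text; the function name is hypothetical.

```python
def heaping_index(percentages):
    """One-half the sum of the absolute deviations from 10 per cent
    of the percentages at the ten unit digits of age."""
    if len(percentages) != 10:
        raise ValueError("expected one percentage for each digit 0 to 9")
    return 0.5 * sum(abs(p - 10.0) for p in percentages)


if __name__ == "__main__":
    # Foreign-born white women, 1950 census, digits 0 to 9 (Table 2).
    foreign_born_white_women = [12.2, 8.0, 10.1, 9.5, 9.6,
                                11.4, 9.5, 9.5, 10.3, 9.9]
    print(round(heaping_index(foreign_born_white_women), 1))
    # Extreme case: every person reports an age ending in the same digit.
    print(heaping_index([100.0] + [0.0] * 9))
```

Because the percentages sum to 100, the deviations above 10 per cent equal in total the deviations below it, which is why half the sum can be read as the net proportion of persons shifted from one digit to another.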

ANALYSIS OF U. S. CENSUS DATA


Table 1 shows the preference for digits of age in the total United
States population for various censuses. Considerable heaping at digits
0 and 5 occurred in the past, although there has been much
improvement in the past 70 years. Thus, in 1880, digit 0 showed a relative
excess of 68 per cent over the normal proportion of 10 per cent
theoretically to be expected, whereas by 1950 this excess amounted to only 12
per cent. The heaping at digit 5, which in 1880 was half as large as
that at digit 0, has by 1950 virtually vanished. Throughout the period
a slight heaping at digit 8 has apparently been present. The greatest
understatement has occurred for digit 1, thus indicating that the
heaping at digit 0 seems to be due to digit 1 rather than digit 9. The index
of preference has decreased steadily from more than 10 in 1880 to
almost 2 in 1950.

2 A very similar method of analysis, which in practice produces very much the same result, was
developed independently by Roberto Bachi in “Measurement of the tendency to round off age returns,”
Proceedings of the International Statistical [?], Rome, 1953.
3 Or in other words, is based on only the preferred, or overstated, digits (or conversely, only on the
disliked, or understated, digits).


TABLE 1

PREFERENCE FOR DIGITS OF AGE IN THE TOTAL CONTINENTAL
UNITED STATES POPULATION FOR VARIOUS
CENSUSES, POPULATION AT EACH DIGIT OF AGE
AS PER CENT OF TOTAL POPULATION(a)

Digit of                            Census
  Age      1880   1890   1900   1910   1920   1930   1940   1950(b)
   0       16.8   15.1   13.2    [?]   12.4   12.3   11.6   11.2
   1        [?]    [?]    [?]    [?]    [?]    [?]    8.5    8.9
   2        9.4    9.7    [?]   10.0   10.2   10.3   10.4   10.2
   3        [?]    [?]    [?]    [?]    9.4    [?]    9.6    9.7
   4        [?]    [?]    [?]    [?]    [?]    [?]    [?]    9.7
   5       13.4   12.3   11.3   11.5   11.3   11.2   10.7   10.6
   6        [?]    [?]    [?]    9.6    [?]    [?]    9.6    9.8
   7        [?]    [?]    [?]    9.1    9.4    9.3    9.6    9.7
   8       10.2   10.4   10.2   10.7   10.6   10.5   10.3   10.2
   9        8.2    8.5    9.7    9.4    9.6    9.8   10.0   10.1

 Total    100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0
 Index(c)   [?]    [?]    [?]    5.8    [?]    4.3    3.0    2.2

(a) The method of analysis is the “blended” method as described in the text, using starting ages
[?] and ending at age 99 in all cases. These percentages, in effect, relate the reported population
at each digit of age to the “true” population.
(b) Based on 20 per cent sample data.
(c) The index is one-half the sum of the deviations from 10.0 per cent, each taken without regard to
sign. These indexes, in effect, indicate the minimum net proportion of persons who return their ages with
an inaccurate unit digit.

Table 2 indicates the variation in preference for digits of age by sex,
race, and nativity for the 1950 census. For native-born whites, there is
very little preference for any particular digit of age although there is
a small heaping at digit 0, and to some extent at digit 5. The index of
preference is at the very low level of 1.3 for native-born white men
although being somewhat higher for women of the corresponding group.
The foreign-born whites show greater inaccuracy in the reporting of
ages than native-born whites, with the index of preference being about
twice as high, while for nonwhites, the index is about 4 times as high
as for native-born whites.

TABLE 2

PREFERENCE FOR DIGITS OF AGE BY RACE AND SEX IN
1950 CENSUS OF CONTINENTAL UNITED STATES,
POPULATION AT EACH DIGIT OF AGE AS PER
CENT OF TOTAL POPULATION(a)

                        Men                           Women
Digit of    Native-  Foreign-   Non-      Native-  Foreign-   Non-
  Age        born     born      white      born     born      white
             White    White                White    White
   0         10.6     11.4      13.0       11.0     12.2      13.4
   1          9.2      8.4       7.5        9.0      8.0       7.1
   2         10.2     10.3       9.9       10.2     10.1       9.8
   3          9.8      9.7       8.9        9.8      9.5       8.7
   4          9.9      9.8       9.3        9.7      9.6       9.0
   5         10.3     11.1      11.5       10.5     11.4      11.5
   6          9.8      9.6       9.4        9.8      9.5       9.4
   7          9.9      9.7       9.4        9.7      9.5       9.4
   8         10.0      9.9      10.5       10.2     10.3      10.9
   9         10.1     10.0      10.6       10.0      9.9      10.7

 Total       99.8     99.9     100.0       99.9    100.0      99.9
 Index(b)     1.3      2.8       5.6        2.0      4.0       6.6

(a) The method of analysis is the “blended” method as described in the text, using starting ages 23
to 32 and ending at age 99 in all cases. These percentages, in effect, relate the reported population at
each digit of age to the “true” population. Based on 20 per cent sample data.
(b) The index is one-half the sum of the deviations from 10.0 per cent, each taken without regard to
sign. These indexes, in effect, indicate the minimum net proportion of persons who return their ages
with an inaccurate unit digit.


There is considerable evidence of greater accuracy of reporting in the
1950 census since the indices for all categories are considerably lower
than in previous censuses (see Table 3). In each category and for each
census, the index of preference is lower for men than for women, with
the relative differential being about 3 for native-born whites although
generally somewhat less than this for foreign-born whites and
nonwhites.


TABLE 3

INDICES SHOWING PREFERENCE FOR DIGITS OF AGE, BY
RACE AND SEX, CONTINENTAL UNITED STATES,
1930, 1940, AND 1950 CENSUSES(a)

                                        Census
Sex and Race                     1930    1940    1950(b)
Men, Native-born White            2.8     [?]     [?]
Men, Foreign-born White           5.5     [?]     [?]
Men, Non-white                   12.0     [?]     [?]

Women, Native-born White          3.4     [?]     [?]
Women, Foreign-born White         6.0     [?]     [?]
Women, Non-white                 12.8     [?]     [?]

(a) The method of analysis is the “blended” method as described in the text, using starting ages 23
to 32 (except for white population in 1940, for which ages 35 to 44 were, of necessity, used) and ending
at age 99 in all cases. The index, in effect, indicates the minimum net proportion of persons who return
their ages with an inaccurate unit digit.
(b) Based on 20 per cent sample data.


ANALYSIS OF CENSUS DATA FOR OTHER COUNTRIES


Some indication of the relative accuracy of the reporting of digits
of age in the 1950 United States census may be obtained by considering
the indices of preference in recent censuses in other countries where it
would be expected that, because of a high degree of literacy, good
reporting would be obtained. Data by single years of age sufficient to
make such an analysis are available for Australia (1947), Canada
(1951), and Great Britain (1951), with the results being as follows:

                             Index of Preference
Country                        Men      Women
Australia                      1.2       1.4
Canada                         1.3       1.6
Great Britain                  1.4       1.1
United States, Total           1.9       2.4
United States, White           1.5       2.2

As contrasted with the other three countries, the index of preference
for the United States is significantly higher, especially for women.
However, when only the white population of the United States is
considered, the index for men compares quite favorably with those for the
other countries, but for women the United States index is significantly
higher. In fact, it is pertinent to note that in the other countries, there
is relatively little difference in the accuracy of reporting of ages as
between men and women, whereas for the United States women
definitely do not report as accurately as men.

In both Canada and Great Britain, the only significant evidence of
heaping is for digit 0, but this is of relatively minor significance,
representing an overstatement of at most 10 per cent relatively. For
Australia, the situation is somewhat different since there is only a slight
indication of heaping at digit 0; in fact, there is evidence of a certain
amount of heaping at digit 7. This peculiarity possibly arises because
the census was taken in 1947, and the question on age was framed so
as to ask for year of birth. Accordingly, a very sizable number of
persons reported the “round” year, 1900 (the number shown at age 47
being 10 per cent greater than the average at ages 46 and 48).


SUMMARY AND CONCLUSIONS


The accuracy of the reporting of ages in the 1950 United States
census has been in accord with the trend of steady improvement
prevailing over the last 70 years. For certain groups, especially
native-born white males, age reporting now, at least insofar as preference for
digits of age is concerned, has reached almost as great accuracy as can
ever be expected. There is, however, significant room for improvement
in the nonwhite population. Furthermore, the reporting of ages by
women is significantly less accurate than for men despite the fact that
in various foreign countries there is little difference between the sexes
as to accuracy of age reporting.


VALIDATION OF MORBIDITY SURVEY DATA BY 
COMPARISON WITH HOSPITAL RECORDS* 


NEDRA B. BELLOC
California State Department of Public Health†


Household sample surveys are being used increasingly¹ to obtain
information on the status of the health of the population. Physicians,
and others who are to use many of the data so collected, are
likely to raise the question: When lay interviewers, no matter how
carefully trained, question lay respondents on a subject as complex as
illness, are the results sufficiently accurate to justify the continued use
of this method of measuring morbidity?

Validation of the measures of morbidity by comparison with an
independent criterion can be done only if the person who reported
illness received some medical service. Reports of absence from school
or work are not necessarily proof of illness. This means that a large
proportion of illnesses reported in household surveys are not subject to
verification, since they are not medically attended.

In almost all of the earlier illness surveys some attention was paid
to the assessment of the accuracy of the diagnostic information
obtained [7, 11, 13]. In some cases the effort was directed at “improving”
the diagnoses given by respondents [16, 17] by the substitution of
medical reports for those given in the surveys. Only in the National
Health Survey [8, 14] were sufficient data presented to enable an
evaluation of the extent of agreement between the family’s and the
physician's diagnoses. In that study, however, as in several of the others
[2, 5, 6, 10], the diagnoses as given by the family were submitted to
the physician for confirmation or change, creating an obvious
prejudice in favor of agreement of the diagnosis.

The method used in these surveys has been the checking of a survey
report against a corresponding physician record or hospital record. The
degree of agreement has been described as a percentage of records in
which the diagnosis agreed out of those which were checked. Another
aspect of the validity of survey data, that of completeness of reporting
of the fact of illness, has been virtually ignored.

* This investigation [was supported by a research] grant (RG-1702) from the National Institutes of
Health, U. S. Public Health Service. [Presented at the (number illegible)] Annual Meeting of the American
Statistical [Association, Washington,] D.C., December [date illegible].

† [Partly illegible] preliminary work on this study which were done by Arthur Weisman.

¹ [Such] as the Pittsburgh Arsenal Health District Studies, Canadian Sickness Survey, surveys
[in] Hunterdon County and Baltimore, the Special Research Project of the Health [Insurance] Plan
of Greater New York, as well as the California Morbidity Research Project.

It is to be noted that, while it is valuable in some ways, the matching
of individual records with criterion sources does not give a statistical
measure of bias. Individual reports might differ considerably from the
corresponding reports in the criterion source, and yet the errors might
compensate in such a way that the over-all description of morbidity
given by the two sources could be identical. The statistician is interested
in a test of validity which will enable him to answer the question,
“Does the measure of morbidity obtained by a household survey differ
significantly from that obtained by reference to medical records?”
Comparison with “criterion sources” can be made without the implication
that these criteria are more accurate than the survey. In some
cases, it is quite probable that the household survey reports are more
complete than the medical records. If the completeness of reporting in
the household survey is to be measured, however, it is necessary to have
a criterion source which is in itself extremely accurate. Otherwise, the
household survey will produce a large number of over-reports solely
because of the inadequacy of the criterion.

This paper will present some of the results of the validation checks
with records of hospitalization which were done in a survey undertaken
by the California State Department of Public Health in San Jose in
the spring of 1952 [1, 18]. The method used was to collect abstracts of
records from the hospitals serving the City of San Jose for all persons
resident in the city, and then to locate in this file of abstracts the records
of hospitalization for all persons in the household sample survey.
The two sets of reports thus obtained, from the household survey and
from hospital records, form the bases for the computation of various
measures. The net differences² between these two sets of statistics are
examined in this report.³

The measurement of hospitalized illness by securing records from
hospitals for a sample of the population required intensive work with
the hospitals, and would not be practical except in a limited area such
as San Jose. This study could not have been done without the patience,
interest, and whole-hearted cooperation of administrative and medical
record personnel in the participating hospitals.


² Marks and Mauldin [12] have classified errors in surveys as of three types: sampling, response,
and processing. Since our survey reports are processed data, the error being studied here, while primarily
response error, includes errors of [processing].

³ Since there is some interest in the extent of agreement on “matched cases,” the method used in
other surveys for validation, footnote references are made to analyses made by this method also.


834 AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1954 


Data on Hospitalized Illness Obtained from Hospital Records 


For the seven months prior to the beginning of the survey, and for
the five months of the survey, the four general hospitals and the State
mental hospital in the area prepared abstracts of the records of patients
living in San Jose.⁴ These abstracts included the name, age, sex,
address, admission date, discharge date, days in hospital, surgery performed,
and admission and final diagnoses. The hospitals permitted
checking of their records by members of the staff of the Project to insure
a complete file of abstracts. These abstracts were filed by Soundex
code of the surname, and the final diagnoses were coded according to
the International Statistical Classification of Diseases, Injuries and
Causes of Death [9].


Data on Hospitalized Illness Obtained in Household Survey 


In the initial interview, household respondents were asked whether
any member of the household had been a patient in a hospital overnight
or longer during the 12 months preceding the month of interview. If
so, data were obtained on name of hospital, month of admission, length
of stay in nights, operations performed, and diagnosis.

Control cards for all families in the survey were made from the interview
schedules, showing name, age, sex, and address (previous addresses
were included where given). These cards, which did not contain
any illness data, were filed according to the Soundex code of the surname.


Matching of Reports of Hospitalization in Survey and Hospital Records 


For the initial matching operation, it was desired to locate all hospital
records for all individuals included in the survey. This was done
by two clerks who went through the file of control cards, systematically
searching the file of hospital abstracts for persons with matching or
similar names, ages, and addresses. Differences of two years or less in
age were considered matching, and variations of one letter in name
were disregarded. After the first check, a recheck discovered only three
more matching cases.
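
The clerical matching rule described above can be sketched in code. The following is only an illustration under stated assumptions, not the procedure the clerks actually followed: the `Person` record, the `matched_items` helper, and the reading of "variations of one letter" as a single substituted, added, or dropped letter are all hypothetical. The `soundex` function follows the standard American Soundex rules, which is assumed to be the coding used for the files.

```python
from dataclasses import dataclass

# Standard American Soundex letter groups (assumed to be the filing code used).
_GROUPS = {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT", "4": "L", "5": "MN", "6": "R"}
_DIGIT = {ch: d for d, letters in _GROUPS.items() for ch in letters}


def soundex(name: str) -> str:
    """Return the four-character Soundex code of a surname."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0]
    prev = _DIGIT.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters that share a code
        digit = _DIGIT.get(ch, "")
        if digit and digit != prev:
            code += digit
        prev = digit  # a vowel resets prev, so a repeated code after it counts again
    return (code + "000")[:4]


def within_one_letter(a: str, b: str) -> bool:
    """True if the names are equal or differ by one substituted, added, or dropped letter."""
    a, b = a.upper(), b.upper()
    if a == b:
        return True
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) == len(b):
        return sum(x != y for x, y in zip(a, b)) == 1
    short, long_ = (a, b) if len(a) < len(b) else (b, a)
    return any(long_[:i] + long_[i + 1:] == short for i in range(len(long_)))


@dataclass(frozen=True)
class Person:
    """Hypothetical record holding the identifying items on a control card or abstract."""
    surname: str
    age: int
    address: str


def matched_items(card: Person, abstract: Person) -> tuple:
    """List which identifying items agree, using the tolerances stated in the text."""
    items = []
    same_code = soundex(card.surname) == soundex(abstract.surname)
    if same_code and within_one_letter(card.surname, abstract.surname):
        items.append("name")
    if abs(card.age - abstract.age) <= 2:  # two years or less counts as matching
        items.append("age")
    if card.address.strip().lower() == abstract.address.strip().lower():
        items.append("address")
    return tuple(items)
```

For instance, `matched_items(Person("Smith", 34, "12 Oak St"), Person("Smyth", 36, "12 Oak St"))` reports agreement on all three items, while a six-year age gap leaves only name and address. The names and addresses here are invented for illustration.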

Hospital records for “matched” persons were compared with the
information obtained in the household survey. In a large proportion of
cases, the period of hospitalization was clearly the same in both
sources, in spite of certain discrepancies in reported date of admission,
length of stay, or diagnosis. In some cases, however, there were hospital
records of illness which had not been reported in the household
survey (these we call possible “under-reports”); other cases were reported
in the household survey but not located in our file of hospital
record abstracts (these we call possible “over-reports”). That is, under-
and over-reporting of survey information are with respect to the criterion.

⁴ [Illegible] 21 [per cent] of the hospitalizations of residents of San Jose occurred in hospitals outside
[the area]. [Illegible] of these were by persons who [reported] residing in another city at some time during
[illegible]. The remainder were in hospitals in nearby cities, including Veterans [illegible],
the Southern Pacific Hospital, University of California Hospital, etc. The
[illegible] area may be different in duration and type from those cases in the immediate
[illegible] given here do not represent measures of hospitalization for the population.

Possible over- and under-reports were subjected to further search. 
Variations in name were considered, and the telephone directory some- 
times verified the fact that two individuals (or families) of the same 
name lived at different addresses in San Jose. For some cases, the name 
of the nearest relative was secured from the hospital to serve as a fur- 
ther means of identification. In this process, five additional matched 
cases were discovered. 

Of some interest may be the degree to which a match was secured on
the items used for check purposes. Table 1 summarizes the results of
the matching operation with respect to records of hospitalization in the
five hospitals subsequent to July 1, 1951.


TABLE 1

NUMBER OF MATCHED HOUSEHOLD SURVEY REPORTS AND
HOSPITAL RECORDS AND NUMBER OF HOUSEHOLD SURVEY
UNDER-REPORTS BY MATCHED ITEMS

Matched Items            Matched Records    Under-Reports

TOTAL                          249                39

Name, age, address             221                33
Name, age                       20                 2
Name, address*                   6                 2
Age, address                     2                 2

See text for definition of terms.
* Includes those in which age was not stated in the survey.


Under-Reporting and Over-Reporting of the Event of Hospitalization in
the Household Survey

A total of 279 periods of hospitalization were reported in the household
survey in the five hospitals with discharge date after July 1, 1951.⁵
Of these, 249, or 89 per cent, were matched with hospital records. Of
the 30 (11 per cent) “over-reports,” i.e., reports in the household survey
which were not matched with hospital records, 20 were identifiable at
the hospital which was named. In eleven of these cases, the individuals
had reported as during the survey period a hospitalization which actually
occurred as long as one year earlier, and in seven cases, hospitalization
had been reported in the household interview as overnight or
longer when the hospital record showed discharge on the same day as
admission. Table 2 shows a summary of over- and under-reporting.


TABLE 2

REPORTS OF HOSPITALIZATION IN HOUSEHOLD SURVEY AND
IN HOSPITAL RECORDS FOR RESIDENTS OF SAN JOSE IN
SAMPLE, JULY 1, 1951-MAY 31, 1952*
WITH NUMBER AND PER CENT MATCHED AND WITH
CLASSIFICATION OF THOSE NOT MATCHED

Category of Report                                              Number   Per cent

Reports of hospitalization in household survey                    279     100.0
  Matched with hospital records                                   249      89.2
  Not matched with hospital records (over-reports)                 30      10.8
    Identified at hospital                                         20       7.2
      Stay prior to check period                                   11       3.9
      Stay not overnight                                            7       2.5
      Record of this person, but not of this hospitalization        2       0.7
    Not identified at hospital                                     10       3.6

Hospitalizations in hospital records for survey population        288     100.0
  Matched with household survey reports                           249      86.5
  Not matched with survey reports (under-reports)                  39      13.5
    Multiple admissions, not all admissions reported in survey     16       5.6
    Reported in survey, but not during check period                 8       2.8
    Admissions to state mental hospital                             4       1.4
    Other types of under-reports                                   11       3.8

* Check period varied from 7 to 11 months for the different subgroups in the sample.


A total of 288 periods of hospitalization were shown in hospital records
for persons in the household survey in the period subject to check.
Of these, 39, or 14 per cent, were not reported by the respondents in
the household survey. In 40 per cent of these cases, persons in the survey
had more than one admission in the period, and reported at least
one other.⁶ In another 20 per cent, the period of hospitalization was
reported, but the date given was such that it did not fall within the
period subject to check. There was failure to report four stays in the
State mental hospital. (Two such hospitalizations were reported and
matched among the 249 appearing in both survey and records.)

⁵ Persons in the city segments of the household survey reported 403 hospitalizations in the year
preceding the month of interview. Of the 374 in the five hospitals in the area, 95, or 25 per cent, were
for discharges prior to July 1, 1951, the beginning point of the time when hospital discharge records were
[illegible]. The [period] covered by the reports subject to check varied from 7 to 11 months,
[illegible] which the initial interview was taken. While episodes of hospitalization in the
[illegible] more accurately than those which occurred nearly a year before, differences in
[illegible] 7 months were not significant. It is believed that no appreciable error was introduced
by the use of the varying period subject to check.
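
The relationship between the matched counts and the over- and under-reports can be checked with a few lines of arithmetic. This is a minimal sketch using only the counts quoted in the text. The function name `reporting_summary` and the label "gross error" for the sum of over- and under-reports are illustrative choices, not terms from the paper.

```python
def reporting_summary(survey_reports: int, hospital_records: int, matched: int) -> dict:
    """Summarize agreement between survey reports and criterion records."""
    over_reports = survey_reports - matched      # in survey, not found in records
    under_reports = hospital_records - matched   # in records, not reported in survey
    return {
        "over_reports": over_reports,
        "under_reports": under_reports,
        "pct_survey_matched": round(100 * matched / survey_reports, 1),
        "pct_hospital_matched": round(100 * matched / hospital_records, 1),
        # Net difference: what a comparison of the two totals alone would show.
        "net_difference": hospital_records - survey_reports,
        # Gross error (illustrative label): all individual disagreements, ignoring direction.
        "gross_error": over_reports + under_reports,
    }


san_jose = reporting_summary(survey_reports=279, hospital_records=288, matched=249)
```

With the San Jose counts, 30 over-reports and 39 under-reports largely offset each other: 69 individual discrepancies leave a net difference of only 9 periods of hospitalization between the two sources.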


Comparison of Common Measures of Hospitalization as Derived from
Household Sample Survey and from Hospital Records

The 279 periods of hospitalization reported in the household survey 
and the 288 periods disclosed in hospital records for the same popula- 
tion and period of time form the basis for the computation of a number 
of common measures of hospital utilization shown in Table 3. 


TABLE 3

MEASURES OF HOSPITALIZATION FROM SAN JOSE HOUSEHOLD
SURVEY AND FROM HOSPITAL RECORDS

                                                  From Survey   From Hospital   Ratio
Measure                                           Reports (A)   Records (B)     B/A

Admissions per 1000 persons per year*                 65.5          67.9        1.04
Days of hospitalization per person per year           .609          .655        1.08
Average length of stay per period of
  hospitalization in days                              9.1           9.5        1.04
Per cent of admissions with surgery                   43.6          44.4        1.02

Note: Because the hospitalizations upon which these rates were based were in five hospitals only,
these should not be considered to represent true rates of hospitalization for the population covered.

* $$\frac{\sum \text{Admissions in period covered}}{\sum \text{Person-months covered by survey}} \times 12{,}000$$
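
The rate formula in the footnote to Table 3 and the B/A ratio column can be reproduced as follows. This is a sketch under stated assumptions: the person-months figure is not given in the paper, so the function is shown only in general form, and the dictionary `TABLE_3` simply restates the published values.

```python
def admissions_per_1000_per_year(admissions: float, person_months: float) -> float:
    """Annual admission rate per 1000 persons from admissions and person-months of exposure."""
    # 12 months per year times 1000 persons gives the factor 12,000.
    return admissions / person_months * 12_000


# Published values from Table 3: (from survey reports A, from hospital records B).
TABLE_3 = {
    "admissions per 1000 persons per year": (65.5, 67.9),
    "days of hospitalization per person per year": (0.609, 0.655),
    "average length of stay in days": (9.1, 9.5),
    "per cent of admissions with surgery": (43.6, 44.4),
}

# Ratio B/A for each measure, rounded as in the table.
ratios = {measure: round(b / a, 2) for measure, (a, b) in TABLE_3.items()}
```

Every ratio exceeds 1, which matches the statement in the text that all four measures were slightly higher when taken from hospital records.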


When respondents in a household sample survey were asked about
periods of hospitalization in the year preceding the survey, the information
which they gave yielded an admission rate of 66 per 1000 persons
in five specified hospitals. When the records of these five hospitals
were searched for the names of the persons in the household survey,
records were disclosed which yielded an admission rate of 68 per 1000
persons. The difference between these two rates is not significant at the
five per cent level. (In all tests of significance in this paper, the five
per cent level was used.)

⁶ In all but two of these the unreported period of hospitalization was for the same condition as the
period which was reported.

Similarly, the days of hospitalization per person per year, the average
length of stay, and the per cent of cases with surgery, were all slightly
higher when obtained from hospital records than when obtained from
the household survey. None of these differences was statistically
significant.⁷

The difference in the average length of stay, 9.1 from survey data
and 9.5 from hospital records, was accounted for by several long periods
of hospitalization which were not reported in the survey. When the
239 cases in which length of stay was reported in both sources are compared,
the averages become 9.2 in the survey and 8.6 from hospital
records.⁸ There was some indication in the data that persons in the
survey tended to over-report longer stays to a greater extent than
shorter stays. However, when the differences between reports from
household survey and hospital records for stays of 15 days or less were
compared with the differences for stays of 16 days or longer, using a
t-test which takes into account the differences in variance at the two
ends of the scale, this tendency proved to be not significant at the five
per cent level [3].

Another way in which we may test the usefulness of household survey
data in the reporting of hospitalization is by comparing the distributions
of some of the items as reported in the household survey
with the distributions obtained by reference to hospital records. Such
comparisons have been made for the month of admission to hospital,
length of stay, surgical procedure, and diagnosis.

Table 4 presents the month of admission of the periods of hospitalization
as reported by the household survey and as obtained by the check
of hospital records. Using the chi-square test, the difference between
these distributions was not significant.⁹

Of some concern to medical care plans and hospital administrators
is the distribution of cases by length of stay and the proportion of
total days which are accounted for by stays of various lengths. The
comparison of the household survey reports and hospital records as to


⁷ Since some stays were over 200 days, the standard deviations of the distributions of stays
[were] greater than the [illegible].

⁸ [It] is to be noted that the household survey questions used should yield information differing from
information obtained from hospitals regarding length of stay. The “number of nights” reported in the
[illegible] will always be equal to or one day less than the number of days of hospitalization
[illegible] records. The median length of stay was 4 in both distributions.

⁹ There [illegible] the reported month of admission in 80 per cent of the matched cases.
Discrepancies of one month [illegible] in fifteen per cent of the cases, and of more than one month in the
[remaining] five per cent.
LI v. 




length of stay is shown in Table 5. The distribution of cases shown in
the first column of Table 5 is not significantly different from the distribution
shown in the second column.¹⁰ The degree of precision which
is desired in the reporting of total days of hospitalization would depend,
of course, upon the uses to which the data are to be put. For


TABLE 4

279 REPORTS OF HOSPITALIZATION FROM THE SAN JOSE
HOUSEHOLD SAMPLE SURVEY AND 288 FOR THE
SAME POPULATION FROM HOSPITAL RECORDS,
BY MONTH OF ADMISSION

Month of Admission              Survey Reports   Hospital Records

TOTAL                                279               288

Prior to July, 1951                   12          [illegible]
July                                  31          [illegible]
August                                33                38
September                             30                35
October                               23                22
November                         [illegible]      [illegible]
December                              29                26

January, 1952                         31          [illegible]
February                              19                16
March                                 19                17
April                                  7                 7
May                              [illegible]      [illegible]

Not specified or not available         3          [illegible]


most practical purposes, however, it would seem that the distributions
of days reported by the two methods are equally useful. It is to be
noted, for example, that in the household survey 71 per cent of the total
days of hospitalization were reported in stays of 60 days or less, while
the corresponding figure by hospital records was 68 per cent.
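
The two percentages just quoted follow from the day totals in Table 5. The sketch below assumes that the final "Over" row of Table 5 refers to stays of more than 60 days, as the text implies. The function name is illustrative.

```python
def share_of_days_up_to_cutoff(total_days: int, days_over_cutoff: int) -> float:
    """Per cent of all hospital days that fall in stays at or below the cutoff length."""
    return 100 * (total_days - days_over_cutoff) / total_days


# Totals and "Over" row of Table 5 (days of hospitalization).
survey_share = share_of_days_up_to_cutoff(total_days=2484, days_over_cutoff=719)
hospital_share = share_of_days_up_to_cutoff(total_days=2672, days_over_cutoff=857)
```

Rounded to whole per cents, the two shares are 71 and 68, in agreement with the text.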

When individual household survey reports are compared with the
corresponding hospital records, it is found that among 239 records for
which the item of length of stay was complete in both sources, there
was exact agreement in 127 cases, or 53 per cent. In 65 cases, or 27
per cent, the survey report was greater than the hospital record, and


¹⁰ When grouped into 16 categories, chi-square = 12.




in 47 cases, or 20 per cent, the survey report was less than the hospital
record. The difference between the number of cases reported in
the survey as more than and as less than the hospital record is not significant
at the five per cent level, using the sign test [4]. It is to be
remembered, however, that the definitions of length of stay differed,


TABLE 5

DISTRIBUTIONS OF PERIODS OF HOSPITALIZATION AND DAYS
OF HOSPITALIZATION AS SHOWN IN THE SAN JOSE
HOUSEHOLD SURVEY AND IN HOSPITAL RECORDS,
BY LENGTH OF STAY IN DAYS

                            Number                         Cumulative Percentages
Length of stay      Periods           Days*            Periods             Days*
in days          Survey Hospital  Survey Hospital   Survey Hospital   Survey Hospital

TOTAL              279    288      2484   2672

1                   34     30        34     30       12.5    10.7       1.4     1.1
2                   31     46        62     92       23.9    27.0       3.9     4.6
3                   40     32       120     96       38.6    38.4       8.7     8.2
4                   40     37       160    148       53.3    51.6      15.1    13.7
5                   29     31       145    155       64.0    62.6      21.0    19.5
6                   13     14        78     84       68.8    67.6      24.1    22.6
7                   11     14        77     98       72.8    72.6      27.2    26.3
8                    8      3        64     24       75.7    73.7      29.8    27.2
9                    9     16        81    144       79.0    79.4      33.1    32.6
10                   8     10        80    100       82.0    82.9      36.3    36.3
11                   4      6        44     66       83.5    85.1      38.0    38.8
12                   4      4        48     48       84.9    86.5      40.0    40.6
13                   2      4        26     52       85.7    87.9      41.0    42.6
14                   5      2        70     28       87.5    88.6      43.8    43.6

15 to [illegible]   13     12       233    214       92.3    92.9      53.2    51.6
[illegible]         10      7       263    159       96.0    95.4      63.8    57.6
[illegible] to 60    5      7       180    277       97.8    97.9      71.1    67.9
Over 60              6      6       719    857      100.0   100.0     100.0   100.0
Not stated           7      7

* In the household survey reference was made to “nights in hospital.”


and that a report in the survey might be accurately reported as one
day less than the hospital record. If half of the household survey reports
which were only one day less than the corresponding hospital
record are considered as matching, the data show a tendency, which
is statistically significant at the five per cent level using the sign test,
for over-statement in the household survey.¹¹
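
The first sign test mentioned above, on the 65 cases where the survey report exceeded the hospital record and the 47 where it fell short, can be sketched with an exact two-sided binomial calculation. The exact binomial form is a choice made here for illustration. The paper cites Dixon and Massey [4] and does not say which form of the test was used. The adjusted counts behind the second test are not given in the legible text, so that test is not reproduced.

```python
from math import comb


def sign_test_p_value(n_plus: int, n_minus: int) -> float:
    """Exact two-sided sign test: ties are dropped, and under the null hypothesis
    each remaining difference is positive with probability one half."""
    n = n_plus + n_minus
    k = max(n_plus, n_minus)
    # Upper-tail probability of seeing k or more of the commoner sign out of n.
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)


# 65 survey reports longer than the hospital record, 47 shorter; 127 exact agreements dropped.
p_value = sign_test_p_value(65, 47)
significant_at_5_per_cent = p_value < 0.05
```

The resulting p-value is above 0.05, consistent with the statement that the difference is not significant at the five per cent level.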

One item of information gathered in the survey concerned the operation(s)
performed during a stay in the hospital. These items were
coded according to the first digit of the operation code in the Standard
Nomenclature [15]. Table 6 gives a summary of the reports from the
two sources.

¹¹ With this assumption, cases in which there [illegible] in which
the survey report was less than the hospital record [illegible].
TABLE 6

DISTRIBUTION OF SURGICAL PROCEDURES IN 279 REPORTS
OF HOSPITALIZATION FROM THE SAN JOSE HOUSEHOLD
SURVEY AND 288 FROM THE HOSPITAL RECORDS

                                               Survey     Hospital
Surgical Procedure                             Reports    Records

TOTAL                                            279        288

Surgery not stated or record not available         -          9
Without surgery                                  158        155
With surgery                                     121        124*
  Incision                                         6          7
  Excision                                        79         92
  Amputation                                       2          1
  Introduction                                     1          0
  Endoscopy                                        1          1
  Repair                                          13         18
  Destruction                                      3          1
  Suture                                           1          1
  Manipulation                                     1          3
  Not classifiable above                          14          0

* Twelve hospital records showed two procedures and one showed three. In these cases the
procedure which matched the household survey report was counted.


In coding these procedures such terms as “sinus operation,” “operated
anorectal,” “kidney operation,” and “general surgical work” appeared
in more than ten per cent of the household survey reports.
These terms were not classifiable in the system used above. However,
descriptions which probably referred to only one procedure, such as
“hernia operation” (repair), were given the appropriate code.

It is apparent from inspection of Table 6 that the large group in the
“Not classifiable” category for the household survey reports makes the
picture quite different from that shown by the hospital records. This
difference is statistically significant.¹² Apparently, then, household survey
reports do not give as specific descriptions of surgical procedures
as can be secured from hospital records.

¹² The distributions in Table 6 were grouped into four categories, Incision, Excision, [illegible],
Other, and the chi-square test was applied.
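
The kind of chi-square comparison used for Table 6 can be sketched as follows. The paper applied the test to four procedure categories. The sketch below applies the same statistic to the simpler split of reports with and without surgery from Table 6, leaving out the nine hospital records for which surgery information was not available. This two-category test is an illustration and is not a result reported in the paper. The critical value 3.841 is the standard five per cent point of chi-square with one degree of freedom.

```python
def chi_square_statistic(counts_a: list, counts_b: list) -> float:
    """Pearson chi-square statistic for comparing two distributions over the same categories."""
    total_a, total_b = sum(counts_a), sum(counts_b)
    grand_total = total_a + total_b
    statistic = 0.0
    for a, b in zip(counts_a, counts_b):
        column_total = a + b
        if column_total == 0:
            continue  # an empty category contributes nothing
        expected_a = total_a * column_total / grand_total
        expected_b = total_b * column_total / grand_total
        statistic += (a - expected_a) ** 2 / expected_a
        statistic += (b - expected_b) ** 2 / expected_b
    return statistic


# Table 6: [with surgery, without surgery].
survey = [121, 158]
hospital = [124, 155]

statistic = chi_square_statistic(survey, hospital)
CRITICAL_5_PER_CENT_DF_1 = 3.841
differs_significantly = statistic > CRITICAL_5_PER_CENT_DF_1
```

On these counts the statistic is far below the critical value, which fits the Summary's statement that the fact of surgery was reported accurately even though the description of the procedure was not.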

Diagnoses given in the household survey and those on the hospital
records were coded according to the 3-digit codes of the International
Statistical Classification of Diseases, Injuries and Causes of Death. In
both types of records, the diagnoses were coded in order of mention
except that injuries were given precedence over other conditions. In five
per cent of the survey reports there was more than one diagnosis, while
there were multiple diagnoses in nearly nineteen per cent of the hospital
records.

When the primary diagnoses are distributed according to a 100-group
category¹⁴ system, there are, of course, many categories in which
there are few cases. These distributions are shown in Table 7. In order
to test the significance of the difference here, the frequencies were
grouped into 24 categories such that in no category were there less
than five cases in the hospital records. Differences between the two
distributions were not significant. A further consolidation of the distributions
into eleven groups increased the degree of correspondence.¹⁵

It would appear that for most practical purposes, a description of the
diagnoses of hospitalized illness to be obtained from a household survey
like the one conducted in San Jose will be as useful as one obtained
through reference to hospital records.

SUMMARY

In summary, most of the medical record sources which are available
for validation checks on household survey reports of illness do not provide
an opportunity for a comprehensive check on both the over- and
under-reporting of illness. Only with respect to hospitalized illness was
it possible to study the net error in reporting.

In this study, reports of hospitalization during a preceding period
ranging from 7 to 11 months obtained by interview of households in
San Jose, California, were checked against hospital records in five
hospitals. Records were matched for 249 periods of hospitalization.
Thirty reports in the survey could not be matched with hospital records.
These over-reports included hospitalization which was not overnight,
stays which occurred as long as one year earlier than the study
period, as well as ten which could not be identified at the named hospital.
Thirty-nine periods of hospitalization were not reported by the
respondents in the survey. Nearly half of these under-reports were for
persons with multiple admissions, who reported at least one other
hospitalization. Four were unreported admissions to the State mental
hospital.

¹³ [Beginning illegible] sources. In 133 of these (which included episiotomies and stitches incident to normal deliveries) there
was a report of no surgery on both records. On one record the survey report was “Broke right ankle.
Surgery: None”; while the hospital record showed “Fracture of the int. and ext. trimalleolar of right
ankle, Surgery: Reduction of fractures.” Here, obviously the respondent's concept of what constitutes
[surgery] differed from that generally understood. In two other cases in which the survey reported surgery
while the hospital record showed none, it appears that the hospital record may have been in error. (A)
[Survey report:] “Osteomyelitis in left arm, operated on left arm for drainage.” Hospital record: “Osteomyelitis
[illegible] radius, Surgery: No.” (B) Survey report: “Adhesions, Female trouble due to hysterectomy
[illegible] removed.” Hospital record: “Adhesions cecum and ascending colon. Rt. Ovarian cyst
[illegible] varicosities. Adhesions bands intestine two to lower pelvic area, Surgery: No.”
[In (number illegible)] cases there was a report of surgery in both sources. In 88 of these, or eighty-five per cent,
there was [agreement as] to the surgical procedure when classified into ten categories according to the
[illegible].

¹⁴ See footnote on Table 7.

¹⁵ [Beginning illegible] of the cases. At the 100-group level, this agreement increased to 76 per cent, and when the data are
arranged into 15 groups, the agreement was 85 per cent.

Admission rates based on survey reports did not differ significantly 
from those based on hospital records for the same population. Similarly, 
days of hospitalization per person per year, average length of stay per 
period of hospitalization, and per cent of admissions with surgery were 
calculated accurately from household survey data.

Distributions of admissions from household survey reports of hos- 


TABLE 7 


DISTRIBUTION OF SOLE OR PRIMARY DIAGNOSES IN 279 
REPORTS OF HOSPITALIZATION FROM THE SAN 
JOSE HOUSEHOLD SURVEY AND 281* FROM 

THE HOSPITAL RECORDS 
а 
О 

Number of ` Per cent of 


Isct Hospitalizations Total 
Cod Diagnostic Category? Е 
т. $ Survey Hospital | Survey Hospital 
Reporta Records | Reports Records 
Toran 20 зв» | 100 100 
001-188 | Infective and parasitic diseases a 10. 7 03.6 2.5 
Tuberculosis of respiratory system 2 1 
Food poisoning $ 1 0 
Acute poliomyelitis, infectious encephalitis 
and late effects 4 4 
Other infective and parasitic diseases 3 2 
200-820, | Psychoneurosea, mental disorder and ill-defined 
858, 780, | nervous conditions в 10 2.2 2.6 
781,790, Mental, psychoneurotic and personality 
71 disorders 6 9. 
Epilepsy ОСЕ 
8 » em 
70-889 | Diseases of eyes 2 5 NERIS 
Other inflammation of eye è 0 5 
Cataract ә 2 1 
Other diseases of eye ° o so ав, 
e 
$68, Rheumatic fe iti 
eter, arthritis, muscular rheuma- 
400-410, | tiem and sciatica ; ce з * 2 d 
740-727 | Rheumatic fever and chronic rheumatic heart 
disease e 1 0 д 
Arthritis, not elsewhereelassified 2 cat ee 


И аашаа nu р S E ees 
P = 4 i 


«— 3c 


cud 


550-687, 
785 


690-689, 
786 


AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1954 


TABLE 7—(continued) 


Number of Per cent of 
Hospitalizations Total 
Diagnostic Category? = = Ҥ э 
Survey Hospital| Survey Hospital 
Reports Records | Reports Records 
Diseases of circulation and symptoms referable Т 
to it 24 32 8.6 7.8 
Arteriosclerotic hqart and coronary disease 4 7 
Other diseases of heart А 6 2 
Hypertensive disease 4 1 
Diseases of arteries 3 2 
Varicose veins of lower extremities 2 2 
Haemorrhoids 4 6 
Other diseases of circulatory system 0 1 
Symptoms referable to cardiovascular and 
lymphatic system 1 1 
Colds, influenza and acute respiratory infec- 
tone ч 1 2 4 7 
Other acute upper-respiratory infections 0 1 
Influenza H 1 
Other respiratory diseases and symptoms #1 17 7.5 6.0 
Pneumonia 6 5 
Hypertrophy of tonsils and adeno:ds 13 8 
Chronic sinusitis and nasal diseases 1 2 
Pleurisy, empyema and lung abscess 0 1 
Symptoms referable to respiratory system 1 1 
Disorders of üpper gastyo-intestinal tract 8 8 2.9 2.8 
Ulcer ^f stomach 5 1 
Ulcer of duodenum 0 5 
Other disorders of stomach and duodenum 2 2 
Symptoms referable to ‘upper gastro-intes- 
tinal tract » X 0 
> 
Disorders of lower gootro-intestinal tract 30 33 10.8 11.7 
à ix "e 12 10 
Hernia and intestinal obstruction 7 8 
Gastro-enteritis and colitis , z 2 7 
Cholelithiasis and cholecystitis, 3 4 
Other disorders of digestive system 6 4 
Disorders of genito-urinary system and compli- 
cations of childbearing 97 97 34.8 34.5 
Nephritis 1 0 
Caleuli of urinary system 1 0 
Other diseases of urinary system 3 [Xd 
Eu lasia of prostate 0 1 
er diseases of male genital organs 2 0 
aster of breast 2 0 
orders of menstruation and menopause 0 2 7 
Other diseases of female genital organs 11 10 
Complications of pregnancy, childbirth 
and puerperium ar 78 
е referable to genito-urinary sys- 
lo 


2 


2 


© 
VALIDATION OF MORBIDITY SURVEY DATA 
TABLE 7—(continued) 


Number of 
1 Hospitalizations 
DO Diagnostic Category? I———————À 
Codes Survey Hospital 
i Reports Records 
690-718, | Diseases of skin and skeletal system, malforma- 3 ЭУР, 
730-759 | tions v 10, 14 3.0 5.0 
* Other infections of skin and subcutaneous o i 
tissue 0 2 Y 
Other disenses of skin and subcutaneous c E 
tissue 1 1 
Osteomyelitis and periostitis 1 1 
Other diseases of bone 2 1 iit 
Other diseases of joints except ankylosis 5 2 
Other acquired musculoskeletal deformities 1 3- 
Congenital malformations 0 4 
140-899, | Other diseases and symptoms 4 49 16.8 17.4 
330-834, | Neoplasms 30 28 
840-352, | — Asthma 1 0 
354-862, Diabetes mellitus 3 3 
804-309, | — Ansemias 0 1 
760-776, Other allergic, endocrine, metabolic and 
787-789, blood diseases 1 3 
798-795 Vascular diseases affecting central nervous 
system 3 4 \ 
Other diseases of central nervous aystem 1 3 
Diseases peculiar to early infancy 2 2 
Symptoms referable to limbs and bask 0 1 
Other and ill-defined symptoms and condi- e 
tions t ie * 
800-999 | Injuries 20 17 7.2 6.0 
Fractures 7 8 
Dislocations, sprains and strains у 3 0 
Interval injury of chest, abdomen, pelvis | y 
and head injury without fracture ly 1 
Lacerations and open wounds м ud 
Burns 1 1 
Other and unspecified effects of external causes 7 6


See reference [9]. Groups used are from “Drafts of Five Special Condensations and Expansions
of the International Statistical Classification of Diseases, Injuries, and Causes of Death to Provide for
Presentation in Convenient Form of Statistics of Sickness Surveys, Sickness Absenteeism, and Hospital
Diagnosis,” received from I. M. Moriyama, Secretary, U. S. Committee of Vital and Health Statistics,
with letter dated 1-6-53.

Categories in which there were no reported periods of hospitalization are not included.

* Excludes 7 cases for which diagnoses were not available. 


pitalizations by month of admission, length of stay, and Шен Doa 
similar to the distributions obtained from hospital records. Whether or 
not surgery was performed was reported accurately in the household
survey, but the description of surgical procedure was not as precise as
that obtained from hospital records.


846 AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1954 


This study shows that reports of hospitalization obtained in house- 
hold sample surveys are sufficiently accurate to be used for many pur- 
poses in lieu of hospital record data. 


REFERENCES

[1] Breslow, Lester, “California morbidity research project—Conduct of the
San Jose survey,” to be published.

[2] Collins, Selwyn D., “Causes of illness in 9,000 families based on nation-wide
periodic canvasses, 1928-1931,” Public Health Reports, 48 (1933), 283-308.

[3] Dixon, Wilfrid J., and Massey, Frank J., Jr., Introduction to Statistical
Analysis, New York, McGraw-Hill, (1951), pp. 104-5.

[4] Ibid., pp. 247-252.

[5] Downes, Jean, and Collins, Selwyn D., “A study of illness among families
in the Eastern Health District of Baltimore,” Milbank Memorial Fund
Quarterly, 18 (1940), 5-26.

[6] Downes, Jean, Collins, Selwyn D., and Jackson, E. H., “Medical care
among males and females at specific ages—Eastern Health District of
Baltimore, 1938-1943,” Milbank Memorial Fund Quarterly, 29 (1951), 5-30.

[7] Falk, I. S., Klem, Margaret C., and Sinai, N., The Incidence of Illness and
the Receipt and Costs of Medical Care Among Representative Families,
Chicago, University of Chicago Press, (1933), p. 72.

[8] Holland, Dorothy F., and Altenderfer, Marion E., Sickness in a Metropolitan
Community—The Results of a Health Survey of New York City, U. S. Public
Health Service, multilithed, 161 pp.

[9] International Statistical Classification of Diseases, Injuries and Causes of
Death, Sixth Revision of the International List of Diseases and Causes of
Death, Geneva, World Health Organization, (1949), Two vols.

[10] Jackson, Elizabeth H., “Duration of disabling acute illness among employed
males and females—Eastern Health District of Baltimore, 1938-1943,”
Milbank Memorial Fund Quarterly, 29 (1951), 294-330.

[11] Kennedy, M. Eileen, “Administration and methods of enumeration of the
sickness survey in Alberta,” Canadian Journal of Public Health, 44 (1953).

[12] Marks, Eli S., and Mauldin, W. Parker, “Response errors in census
research,” Journal of the American Statistical Association, 45 (1950), 424-38.

[13] Peart, A. F. W., “Canada’s sickness survey—Review of methods,”
Canadian Journal of Public Health, 43 (1952), 401-14.

[14] Perrott, George St. J., Tibbitts, Clark, and Britten, Rollo H., “The National
health survey,” Public Health Reports, 54 (1939), 1663-87.

[15] Standard Nomenclature of Diseases and Operations, American Medical
Association, New York, Blakiston Co. (1952), p. 518.

[16] Stecker, Margaret L., Some Recent Morbidity Data, New York, Metropolitan
Life Insurance Co. (1919), p. 12.

[17] Sydenstricker, Edgar, “A study of illness in a general population group,”
Public Health Reports, 41 (1926), 2069-88.

[18] Weissman, Arthur, “California morbidity research project,” American
Journal of Public Health, 42 (1952), 711-16.


BUSINESS FAILURES: ANOTHER EXAMPLE OF THE 
ANALYSIS OF FAILURE DATA 


K. S. Lomax
University of Manchester 


The analyses of failure data given by Davis [1] all involve
essentially constant or increasing conditional probabilities
of failure. For business failures, however, it is reasonable to
expect monotonically decreasing conditional probabilities.
An analysis of data on failures of four types of business in
Poughkeepsie, New York, from 1844 to 1926 [2] confirms this
expectation. The conditional probabilities of failure for these
four series are well described by both exponential and hyperbolic
functions.


The interesting and stimulating paper by D. J. Davis [1] on failure
data is most suggestive to the economist.
Broadly, Davis analyzes three types of failure theory:

(a) The normal theory of failure, in which the failure probability
density function is Gaussian.

(b) Human mortality, characterized by rapid increase of the
conditional density function after middle age.

(c) Exponential theory of failure, in which the conditional density
function is constant.

In (a) uniformly and in (b) after the very early years of life the
conditional density function of failure probability with time is strictly
monotonic increasing. In (c) it is constant.

The economist immediately thinks of business failures, in which it is
reasonable to expect the conditional density function strictly to
decrease monotonically. The purpose of this note, then, is to draw
attention to a fourth category of failure theory:

(d) Business mortality. It is fairly well established that with most
types of business the early years are the most difficult. It is then
that mortality is highest. The longer a business survives, generally,
other things being equal, the smaller becomes the probability
of failure.


Take, for example, the useful and comprehensive data compiled by
R. G. and A. R. Hutchinson and Mabel Newcomer [2]. Their Table I,
showing the length of life for business enterprises established in
Poughkeepsie between 1844 and 1926, can serve as basis for calculation of
F(t), for different values of t, where

F(t) = cumulative probability of failure in the interval (0, t).


The results of these calculations, omitting wholesale businesses since 
the sample there was small in comparison with the other categories, are 
shown in Table 1. 


TABLE 1

CUMULATIVE PROBABILITY OF BUSINESS FAILURE
IN POUGHKEEPSIE, 1844-1926

Age in years   Retail   Manufacture   Craft   Service
 1             0.296    0.281         0.307   0.327
 2             0.438    0.346         0.454   0.457
 3             0.532    0.469         0.551   0.551
 4             0.594    0.547         0.607   0.618
 5             0.643    0.602         0.660   0.669
 6             0.684    0.655         0.697   0.708
 7             0.715    0.678         0.727   0.743
 8             0.741    0.702         0.753   0.769
 9             0.762    0.726         0.772   0.792
10             0.782    0.746         0.791   0.812

Source: Calculated from Hutchinson, Hutchinson, and Newcomer [2].
In all these cases the graph of F(t) takes the shape shown in Figure 1. 


Now, if

f(t) = probability density function of failure times
     = probability of failure in infinitesimal interval (t, t+dt),

then f(t) = F'(t) and two methods are available for estimation of f(t)
from the above data. One is to measure the slopes of the F(t) graphs
at the different values of t. The other is to use the difference formula

$$F'(t) = \Delta F(t) - \tfrac{1}{2}\Delta^{2} F(t) + \tfrac{1}{3}\Delta^{3} F(t) - \cdots$$

The former method seems to be the more satisfactory here, and the
results of applying it to the data of Table 1 are shown in Table 2.
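The difference formula mentioned above can be illustrated with a minimal Python sketch. It assumes unit spacing between ages, takes F(0) = 0, and uses the Retail column of Table 1. The function names are illustrative, and stopping after three terms is an arbitrary choice made here, not the author's procedure (the author preferred the graphical method).

```python
def forward_difference(values, start, order):
    """Return the forward difference of the given order at index `start`."""
    window = list(values[start:start + order + 1])
    for _ in range(order):
        window = [b - a for a, b in zip(window, window[1:])]
    return window[0]


def derivative_by_differences(values, start, n_terms=3):
    """Approximate F'(t) by Delta F - (1/2) Delta^2 F + (1/3) Delta^3 F - ..."""
    total = 0.0
    for k in range(1, n_terms + 1):
        sign = 1.0 if k % 2 == 1 else -1.0
        total += sign * forward_difference(values, start, k) / k
    return total


# Retail column of Table 1 with F(0) = 0 prepended (index = age in years).
F_RETAIL = [0.0, 0.296, 0.438, 0.532, 0.594, 0.643,
            0.684, 0.715, 0.741, 0.762, 0.782]

# Three-term estimate of the failure density at age 1.
f_at_1 = derivative_by_differences(F_RETAIL, start=1, n_terms=3)
print(round(f_at_1, 3))
```

For comparison, the Retail entry of Table 3 at age 1 (0.249) multiplied by 1 - F(1) = 0.704 gives about 0.175, so the truncated difference formula and the graphical slopes agree only roughly.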


TABLE 2

PROBABILITY DENSITY OF BUSINESS FAILURE IN
POUGHKEEPSIE, 1844-1926

Age in years   Craft   Service
 0             0.5     0.5
 1             0.213   0.192
 2             0.102   0.106
 3             0.072   0.081
 4             0.056   0.059
 5             0.044   0.044
 6             0.034   0.038
 7             0.028   0.031
 8             0.024   0.025
 9             0.020   0.021
10             0.017   0.019

Source: Computed from Table 1.


Thus, f(t) can generally be represented as in Figure 2.

From the values of f(t) and F(t), following Davis, we calculate Z(t),
the conditional density function of failure probability with time, in
other words, the instantaneous probability rate of failure at time t
conditional upon non-failure prior to t:

$$Z(t) = \frac{f(t)}{1 - F(t)}.$$

These results are shown in Table 3.


TABLE 3

CONDITIONAL PROBABILITY DENSITY OF BUSINESS
FAILURE, POUGHKEEPSIE, 1844-1926

Age in years   Retail   Manufacture   Craft   Service
 0             0.57     0.365         0.5     0.5
 1             0.249    0.198         0.307   0.285
 2             0.190    0.182         0.187   0.195
 3             0.152    0.169         0.160   0.180
 4             0.138    0.148         0.142   0.154
 5             0.126    0.128         0.129   0.133
 6             0.114    0.107         0.112   0.130
 7             0.098    0.093         0.103   0.121
 8             0.089    0.077         0.097   0.108
 9             0.094    0.051         0.088   0.101
10             0.083    0.028         0.081   0.101

Source: Calculated from Tables 1 and 2.
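The step from Tables 1 and 2 to Table 3 can be sketched in a few lines of Python. The sketch uses the Craft columns as printed, assumes F(0) = 0 for the age-0 row, and applies Z(t) = f(t) / [1 - F(t)]; the function and variable names are illustrative only.

```python
# Craft columns of Tables 1 and 2 (index = age in years).
# F(0) = 0 is an assumption for the age-0 row, which Table 1 does not print.
F_CRAFT = [0.0, 0.307, 0.454, 0.551, 0.607, 0.660,
           0.697, 0.727, 0.753, 0.772, 0.791]
f_CRAFT = [0.5, 0.213, 0.102, 0.072, 0.056, 0.044,
           0.034, 0.028, 0.024, 0.020, 0.017]


def conditional_density(f_values, F_values):
    """Return Z(t) = f(t) / (1 - F(t)) for each age."""
    return [f / (1.0 - F) for f, F in zip(f_values, F_values)]


Z_CRAFT = conditional_density(f_CRAFT, F_CRAFT)
for age, z in enumerate(Z_CRAFT):
    print(age, round(z, 3))
```

Rounded to three places, these values can be compared directly with the Craft column of Table 3.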


Z(t) is of the form shown in Figure 3.


There is really little purpose to be served by searching for analytical
expressions representing this behavior. This could only be useful if it
were feasible to obtain general support from extraneous sources for a
particular form of relationship. The only such support in this case is
in relation to such trivialities as

$$F(t),\ f(t),\ Z(t) \ge 0,$$
$$F(0) = 0; \quad f(0) = Z(0),$$
F(t) monotonic increasing.

It is, however, of interest to record that a good fit to the Z(t) values
can be obtained, in each case, either by the exponential function

$$Z(t) = a e^{-bt}$$

or the hyperbola

$$Z(t) = \frac{b}{t + a}.$$

The latter appears to be the more appropriate for the Retail, Craft,
and Service groups, while in the case of Manufacturing trades the
exponential gives the better fit. These functions were fitted to the
data in the transformations

$$\log_e Z(t) = \log_e a - bt, \quad \text{linear in } t \text{ and } \log_e Z(t),$$

and

$$\frac{1}{Z(t)} = \frac{t}{b} + \frac{a}{b}, \quad \text{linear in } t \text{ and } \frac{1}{Z(t)}.$$

The correlation coefficients corresponding to these linear forms are
shown in Table 4.


TABLE 4

CORRELATION COEFFICIENTS FOR FUNCTIONS
FITTED TO DATA OF TABLE 3

Type of business   Exponential   Hyperbola
Retail             0.91          0.99
Manufacture        0.96          0.83
Craft              0.93          0.99
Service            [illegible]   0.98
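The linearized fits can be sketched as follows for the Craft series of Table 3. This is a minimal illustration under stated assumptions: all eleven ages 0 to 10 are used, ordinary least squares is applied to (t, log Z) and to (t, 1/Z), and the helper names are hypothetical. Whether the author used exactly these points is not stated in the text.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line of y on x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


AGES = list(range(11))
Z_CRAFT = [0.5, 0.307, 0.187, 0.160, 0.142, 0.129,
           0.112, 0.103, 0.097, 0.088, 0.081]

# Exponential form: log Z = log a - b t.
log_z = [math.log(z) for z in Z_CRAFT]
r_exponential = pearson_r(AGES, log_z)

# Hyperbolic form: 1/Z = t/b + a/b, so slope = 1/b and intercept = a/b.
inv_z = [1.0 / z for z in Z_CRAFT]
r_hyperbola = pearson_r(AGES, inv_z)
slope, intercept = least_squares_line(AGES, inv_z)
b_hat = 1.0 / slope
a_hat = intercept / slope

print(round(abs(r_exponential), 2), round(r_hyperbola, 2))
print(round(a_hat, 2), round(b_hat, 2))
```

Under these assumptions the two coefficients come out close to the Craft entries of Table 4, with the hyperbola giving the tighter linear relation.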


One advantage of the hyperbolic “law” is that the expressions for
f(t) and F(t) remain fairly simple. If

$$Z(t) = \frac{b}{t + a},$$

then

$$F(t) = 1 - \left(\frac{a}{t + a}\right)^{b}$$

and

$$f(t) = \frac{b}{a}\left(\frac{a}{t + a}\right)^{b+1};$$

whereas

$$Z(t) = a e^{-bt}$$

leads to

$$F(t) = 1 - \exp\left[-\frac{a}{b}\left(1 - e^{-bt}\right)\right]$$

and

$$f(t) = a e^{-bt} \exp\left[-\frac{a}{b}\left(1 - e^{-bt}\right)\right].$$

Both alternatives conform to the desirable boundary conditions and
monotonic behavior exhibited by the data.
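The closed forms for F(t) can be checked numerically. Since Z(t) = f(t) / [1 - F(t)] with F(0) = 0, it follows that F(t) = 1 - exp(-∫ Z(u) du) over (0, t). The Python sketch below compares that integral, evaluated by the trapezoidal rule, with the closed forms F(t) = 1 - (a/(t+a))^b for the hyperbola and F(t) = 1 - exp[-(a/b)(1 - e^{-bt})] for the exponential. The parameter values are hypothetical and chosen only for illustration; they are not fitted to the Poughkeepsie data.

```python
import math


def F_hyperbolic(t, a, b):
    """Closed form of F(t) when Z(t) = b / (t + a)."""
    return 1.0 - (a / (t + a)) ** b


def F_exponential(t, a, b):
    """Closed form of F(t) when Z(t) = a * exp(-b t)."""
    return 1.0 - math.exp(-(a / b) * (1.0 - math.exp(-b * t)))


def F_numeric(hazard, t, steps=20000):
    """F(t) = 1 - exp(-integral of Z over (0, t)), by the trapezoidal rule."""
    h = t / steps
    area = 0.5 * (hazard(0.0) + hazard(t))
    for i in range(1, steps):
        area += hazard(i * h)
    return 1.0 - math.exp(-area * h)


A, B = 2.0, 1.5   # hypothetical parameter values
T = 5.0           # age at which the two routes are compared

gap_hyperbolic = abs(F_numeric(lambda u: B / (u + A), T) - F_hyperbolic(T, A, B))
gap_exponential = abs(F_numeric(lambda u: A * math.exp(-B * u), T) - F_exponential(T, A, B))
print(gap_hyperbolic, gap_exponential)
```

Both gaps should be negligibly small, which is consistent with the closed forms. One difference between the two laws is visible from the formulas themselves: as t grows, the hyperbolic F(t) tends to 1, while the exponential F(t) tends to 1 - exp(-a/b).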


I hope shortly to carry out a more detailed analysis of business
mortality covering British as well as


pletely accord with American. 


REFERENCES

[1] Davis, D. J., “An analysis of some failure data,” Journal of the American
Statistical Association, 47 (1952), 113-50.
[2] Hutchinson, Ruth Gillette and Arthur R., and Newcomer, Mabel, “Study
in business mortality,” American Economic Review, 28 (1938), 497-514.


CYCLICAL FLUCTUATIONS IN FOUNDRY ACTIVITY 


LLOYD SAVILLE
Duke University


1. INTRODUCTION


The extreme sensitivity of foundry operations to business change
has been apparent for many years [7]. Only recently, however, has
sufficient information been available to permit an analysis of the
cyclical movements of foundry output as a whole. An adequate accumulation
of data in a number of the Facts for Industry Series of the Bureau
of the Census, some of it collected for the first time during World War
II, now makes possible the construction of a seasonally adjusted index
of foundry activity covering a period of business fluctuations.

In general, foundries produce metal parts or castings to the custom
specification of local firms in durable goods industries; consequently
foundry activity is influenced by cyclical fluctuations in a wide range
of geographic and industrial areas. Five general characteristics of the
industry affect its sensitiveness to change: (1) The production of
foundries is dominated by changes in the demand for a variety of products
commonly classified as durable consumers’ goods, investment goods,
and war materials: castings are employed as bases for pumps, lathes,
and presses; as frames for pianos, lawnmowers, and locomotives; as
wheels for railroad cars, airplanes, and machines; and as component
parts for lamps, engines, and motors. (2) Castings are made by several
thousand establishments operating in many geographically separated
markets; under these atomistic conditions, foundry production
describes the activities of a wide range of firms and tends to minimize
the influence of one or a few firms on production totals. (3) The
production of castings made from different metals is dominated by various
technical requirements: use of aluminum castings is regulated by the
level of output of such objects as airplanes and portable tools, where
lightness of weight is a major consideration; the quantity of castings
made from brass and bronze, malleable iron, steel, and gray iron
reflects, respectively, the rate of manufacture of corrosion-resistant
fittings needed on ships and in chemical plants, shock-withstanding parts
for railroads, armor plates for tanks, and general components for
producers’ and consumers’ goods. (4) Since inventories of rough castings
tend to be small, the production of castings is closely related to
current demand. Foundrymen usually make parts to the specification of
the individual consumer and so find it difficult to accumulate castings
for future orders. (5) Finally, the makers of castings form not only a
variety of items but also a sizeable product. In 1947 their value added
to product totaled almost two billion dollars, or approximately 2.5 per
cent of the value added by the producers of all the manufactured goods
in the United States (Table 1).


TABLE 1

ESTIMATES OF ACTUAL AND RELATIVE VOLUME OF
PRODUCT, VALUE OF PRODUCT, AND VALUE ADDED
TO PRODUCT BY CERTAIN CLASSES OF FOUNDRIES
AND ALL MANUFACTURERS IN THE
UNITED STATES, 1947


Actual figures: quantity of product in pounds (000,000); value of product and value added to product in dollars (000,000). Percentages: share of the total of all foundries.

Group of manufacturers | Quantity of product | Value of product | Value added to product | Quantity (per cent) | Value of product (per cent) | Value added (per cent)
Aluminum and aluminum base foundries | 468 | 256 | 135 | 1.7 | 8.4 | 7.3
Copper and copper base foundries | 1,061 | 394 | 207 | 3.8 | 12.9 | 11.2
Malleable-iron foundries | 1,797 | 231 | 152 | 6.4 | 7.6 | 8.2
Steel foundries | 2,532 | 389 | 252 | 9.0 | 12.8 | 13.6
Gray-iron foundries | 22,271 | 1,773 | 1,108 | 79.1 | 58.3 | 59.7
Total of all foundries | 28,129 | 3,043 | 1,854 | 100.0 | 100.0 | 100.0
Total of all manufacturers | not available | not available | 74,426 | — | — | —

Source: [12, Vol. II, pp. 21, 595,3846, 560, and 564]. Data adjusted to make coverage comparable
with quantities presented in Table 3.
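The percentage columns of Table 1, and the figure of approximately 2.5 per cent quoted in the text, are simple ratios. The short Python sketch below recomputes them for the copper row and for the two totals, using the figures as printed; the dictionary layout and names are illustrative only.

```python
def share(part, whole):
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


# Value added to product, millions of dollars, as printed in Table 1.
VALUE_ADDED_ALL_FOUNDRIES = 1854
VALUE_ADDED_ALL_MANUFACTURERS = 74426
foundry_share = share(VALUE_ADDED_ALL_FOUNDRIES, VALUE_ADDED_ALL_MANUFACTURERS)

# Copper row and foundry totals (quantity, value of product, value added).
COPPER = {"quantity": 1061, "value": 394, "value_added": 207}
TOTALS = {"quantity": 28129, "value": 3043, "value_added": 1854}
copper_shares = {key: round(share(COPPER[key], TOTALS[key]), 1) for key in COPPER}

print(round(foundry_share, 1))
print(copper_shares)
```

The same ratio check, applied row by row, is a convenient way to confirm that the actual figures and the percentage columns of the table agree with one another.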


2. AVAILABLE FOUNDRY INFORMATION


Almost all of the monthly information concerning the operations of
foundries in the United States is found in the Current Statistical Service
and the Facts for Industry series of the Bureau of the Census. Data for
brief periods and for special groups of producers have been collected
by other governmental agencies (e.g., Office of Price Administration)
and trade associations (e.g., Malleable Founders’ Society). Table 2
lists the dates and designations of publications of the Bureau of the
Census in which monthly reports of foundry activity are available.
Series dating from 1923 and 1926 are shown for malleable iron and steel,
but only from 1942 and 1943 for castings made from aluminum, copper,
and gray iron. Consideration has not been given to castings of
magnesium and lead for they are of minor importance, comprising in total
about 2 per cent (by weight) of all nonferrous castings shipped in
1953. Although castings made from zinc approximate in volume
castings made from aluminum, they have been excluded from the
inquiry for two reasons: (1) Comparable data concerning them are not
available prior to 1946. (2) And, more important, the technology and
market structure associated with die casting, the predominant way
of forming zinc, are quite different from those associated with
the sand and permanent molding methods used in making castings of
other metals.


TABLE 2

BUREAU OF THE CENSUS PUBLICATIONS OF MONTHLY
FOUNDRY OPERATING INFORMATION IN THE
UNITED STATES, 1923 TO 1954

Metal | Start of series | Publications in which data are included (Dates are inclusive.) | Designations of current publications
Aluminum | January, 1942 | Series 1-1, 1-3, 1-6, and 1-7, Jan. 1942 to Sept. 1945; Series M24B, Oct. 1945 to Dec. 1945; Series M24E, Jan. 1946 to present. | M24E
Copper | January, 1942 | Series 1-1, 1-3, 1-6, and 1-7, Jan. 1942 to Dec. 1945; Series M24E, Jan. 1946 to present. | M24E
Malleable Iron | May, 1923 | Current Statistical Service, May, 1923 to June, 1944; Series 30-7, Jan. 1943 to Sept. 1945; Series M21B, Oct. 1945 to Dec. 1950; Series M21-1 and M21C, Jan. 1950 to present. | M21-1 and M21C
Steel | January, 1926 | Current Statistical Service, Jan. 1926 to June, 1944; (Production of Commercial Steel Castings, only) Series 30-1, July, 1943 to Sept. 1945; Series M22A, Oct. 1945 to Dec. 1950; Series M21-1 and M21C, Jan. 1951 to present. | M21-1 and M21C
Gray Iron | January, 1943 | Series 30-2, Jan. 1943 to June 1944; Series 30-5, July, 1944 to Sept. 1945; Series M21A, Oct. 1945 to Dec. 1950; and Series M21-1 and M21C, Jan. 1951 to present. (Miscellaneous castings series first available Oct. 1944) | M21-1 and M21C


Currently, only unfilled-order and shipment data are being published;
shipment figures are the more descriptive of the two. Unfilled
orders tend to fluctuate not only with the demand for castings, but
also with the availability of casting facilities, for customers often place
duplicate orders with a number of foundries during boom times in
the hope of achieving prompt delivery from one of them. This
complexity, and the fact that figures are not available prior to December,
1945 for malleable iron and steel castings, reduces the value of unfilled
orders as a measure of cyclical change. Since inventories of finished
castings held by foundries usually are not large, shipment data
correspond very closely to production information; when both production
and shipment figures are available concurrently, no material differences
are evident.

Shipment series for the five foundry industries, inflated to achieve
full coverage and comparability, are shown in Table 3. The universe
of firms has been ascertained with some precision, for the reports
submitted by each company during World War II were used in connection
with the allocation of scarce materials; even establishments which
may have dealt in the black market probably complied with the filing
requirements of the regulations in order to secure a legitimate quota
of metal.

The coverage of monthly releases varies widely; in the case of
malleable-iron foundries, virtually every producer reported every month
during the last ten years, while in the case of each of the other foundry
industries, reports were collected from all known producers in 1946 and
1950. Between these complete enumerations, samples of varying size
were assembled each month. The Census Bureau revised the monthly
reports for 1945 and 1946 on the basis of the annual reports collected
from the universe of firms in 1946, and revised the monthly reports
for 1948 and 1950 on the basis of the 1950 study. Similar revisions
have been made in the monthly data for 1947 and 1948 by the author
based on adjustments of annual totals for these years published by the
Census Bureau. A new sample of the nonferrous firms, makers of
aluminum and copper castings, was established in September, 1952; the
monthly totals for 1951 and 1952 have been revised on this basis.
With the exception of the gray-iron industry the data reflect activity
of the firms in each of these industries. Shipments of gray iron
comprise only three of the five groups of producers often classified
in the broad category, gray-iron foundries. Miscellaneous gray-iron
castings, molds for heavy steel ingots, and chilled iron railroad car
wheels are included in the series because the techniques involved in
their manufacture and sale are generally similar to the processes of
other foundries studied here; cast iron pressure pipe and fittings and
cast iron soil pipe and fittings are excluded for they are standardized
inventory items which are made and marketed quite differently from
the custom-made products included in the study.


TABLE 3

MONTHLY SHIPMENTS OF ALUMINUM, COPPER, MALLEABLE-IRON, STEEL, AND
GRAY-IRON CASTINGS IN THE UNITED STATES, UNADJUSTED FOR SEASONAL
FLUCTUATIONS

[Table body and footnotes illegible in this copy. The table gives monthly
shipments, January through December, by year, in five panels: ALUMINUM,
Pounds (000,000); COPPER, Pounds (000,000); MALLEABLE IRON, Short Tons
(000); STEEL; GRAY IRON, Short Tons (000).]


3. SEASONAL ADJUSTMENTS


The data in Table 3 show a seasonal pattern. Relatively low rates
of production during summer months, even during the war years,
reflect the difficulty of obtaining high productivity in foundries during
warm weather. These recurrent variations in output have been
removed in two steps; each series has been adjusted for the number of
working days in the month and for seasonal fluctuations.
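The two steps can be sketched in Python as follows. This is only an illustration under stated assumptions: a five-day working week, a holiday count supplied by the user, seasonal factors expressed as percentages with an average of 100, and division by the factor applied after conversion to average daily shipments. The shipment figure is hypothetical, and the function names are not taken from the source.

```python
import calendar


def weekdays_in_month(year, month):
    """Count Monday-to-Friday days, assuming a five-day working week."""
    n_days = calendar.monthrange(year, month)[1]
    return sum(1 for day in range(1, n_days + 1)
               if calendar.weekday(year, month, day) < 5)


def adjusted_daily_shipments(shipments, year, month,
                             weekday_holidays, seasonal_factor):
    """Average daily shipments, corrected by a seasonal factor (per cent).

    weekday_holidays is the number of observed holidays falling on
    Monday to Friday in that month; it must be supplied by the user.
    """
    working_days = weekdays_in_month(year, month) - weekday_holidays
    daily = shipments / working_days
    return daily / (seasonal_factor / 100.0)


# Hypothetical example: 46.0 units shipped in July 1953, no weekday
# holidays assumed, and a July seasonal factor of 80.
example = adjusted_daily_shipments(46.0, 1953, 7,
                                   weekday_holidays=0, seasonal_factor=80)
print(weekdays_in_month(1953, 7), example)
```

A seasonal factor below 100 raises the adjusted figure, so that a month which is normally slack is not read as a cyclical decline.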


3.1. Calendar Factors


The working schedule of foundries varied widely during this period;
in general, it involved a single shift, a five-day week, and the
observance of six holidays. No modification was made in the figures for
variations in number of shifts; even during the war years substantially
less than one-half of the foundries worked more than one shift [14, p.
17]. Further, adjustments for shift operations made in response to
changing business conditions, even if they were possible, would tend to
reduce the sensitiveness of the index to cyclical movements.

The first adjustment made in the figures is for changes in the num- 


days per week throughout the period. No corrections were made for
variations in working schedules which occurred as a result of changing
business conditions in the early thirties or in the 1949 recession. Such
corrections tend to make the index less sensitive to cyclical fluctuations,
because alterations in the usual work week are, themselves, measures


The second adjustment is for changes in the number of holidays.
Prior to 1942, five holidays were observed: New Year's Day,
Independence Day, Labor Day, Thanksgiving Day, and Christmas Day.
During 1942, 1943, and 1944 only New Year's Day, Independence


3.2. Seasonal Factors


Pronounced seasonal changes arise from natural working conditions
of the plants and fluctuating demands for their products. Since
concurrent information is available for all metals from January, 1943, the
ten-year period subsequent to this date was selected for a provisional
computation of seasonal factors. The twelve-month-moving-average
method of adjustment was employed; the resulting indexes were
computed by use of the modified-mean technique whereby the extreme
values found for each month were excluded from the calculation of the
mean.¹ This method was useful in removing the influence of such
incidental variations as the steel strike of July, 1952 from the final indexes.
A second series of seasonal indexes was computed, based this time on
the six post-World-War II years of 1947 through 1952. These factors
(Table 4) show a more pronounced pattern of seasonal trend than those
based on a ten-year period including the war years of 1943, 1944, and
1945, and the post-war readjustment year, 1946. This is consistent with
experience in other industries reported by the Federal Reserve Board
[3, p. 1263].
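The twelve-month-moving-average method with a modified mean can be sketched in Python as below. Several details are assumptions made for the illustration and are not stated in the text: the moving average is centered (half weights on the two end months), exactly one highest and one lowest ratio are dropped for each calendar month, and the twelve indexes are scaled to average 100. The test series is synthetic, built from a constant level and the Gray Iron factors of Table 4.

```python
def centered_moving_average(series):
    """Centered twelve-month moving average; None where it is undefined."""
    n = len(series)
    averages = [None] * n
    for i in range(6, n - 6):
        total = (0.5 * series[i - 6] + sum(series[i - 5:i + 6])
                 + 0.5 * series[i + 6])
        averages[i] = total / 12.0
    return averages


def modified_mean(values):
    """Mean after excluding the single highest and single lowest value."""
    if len(values) <= 2:
        return sum(values) / len(values)
    trimmed = sorted(values)[1:-1]
    return sum(trimmed) / len(trimmed)


def seasonal_indexes(series, first_month=1):
    """Return twelve seasonal indexes (January first) averaging 100."""
    averages = centered_moving_average(series)
    ratios = {month: [] for month in range(1, 13)}
    for i, (value, average) in enumerate(zip(series, averages)):
        if average is not None:
            month = (first_month - 1 + i) % 12 + 1
            ratios[month].append(value / average)
    means = [modified_mean(ratios[month]) for month in range(1, 13)]
    scale = 1200.0 / sum(means)
    return [mean * scale for mean in means]


# Synthetic six-year series: constant level of 50 times a known pattern.
PATTERN = [103, 103, 105, 105, 101, 100, 85, 93, 103, 101, 102, 99]
series = [50.0 * p / 100.0 for _ in range(6) for p in PATTERN]
indexes = seasonal_indexes(series)
print([round(x, 1) for x in indexes])
```

With six years of data each calendar month contributes five ratios, so the modified mean rests on three of them; this is why a single disturbed month, such as a strike month, has little influence on the index.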
The ferrous series, malleable iron, steel, and gray iron, have higher
seasonal peaks in the spring than in the fall; in this they follow the
pattern of durable manufactures in general. On the other hand, the
nonferrous indexes, aluminum and copper, behave similarly to the
producers of nondurable goods and reach their annual production
heights in the fall.² This contrasting situation may be explained in
part by the differing utilization of ferrous and nonferrous castings.
Ferrous castings are used extensively in heavy construction and
transportation machinery, industries showing high rates of activity in the
spring; nonferrous castings are employed in aircraft, instruments,
hardware, and light tools, industries having little seasonal fluctuation
or high production levels in the fall [3, pp. 1292-3].


4. MOVEMENTS IN THE SERIES

After adjustment for calendar and seasonal fluctuations, the foundry
series exhibit broadly similar patterns of output change as they move
from phase to phase of the business cycle. Variations are still present,
however, in the individual responses of the industries to different
industrial situations.


¹ This method was utilized because fairly reliable information could be obtained from only a limited
amount of material. Had information covering a longer period been available, the technique used
by the Board of Governors of the Federal Reserve System would have been employed [1].
² In the newly revised Index of Industrial Production a single set of seasonal adjustment factors is
applied to all primary metals [3, p. 1264].


4.1. Relationships Among Foundry Series, 1943-1954 


The concurrent information available for all series, shown in Figure
1, has been adjusted for seasonal fluctuations (Table 4) and stated
comparably in pounds on an average-daily-shipments base. Most of
the fluctuations can be identified with nationally publicized events.
Variations in the series during 1943 and 1944 resulted from two causes:


TABLE 4


SEASONAL ADJUSTMENT FACTORS OF SHIPMENT DATA OF
SELECTED CASTINGS INDUSTRIES IN THE UNITED
STATES, BASED ON 12-MONTH MOVING AVERAGE
ADJUSTMENTS FOR 6-YEAR PERIOD, JANUARY,
1947, TO DECEMBER, 1952, INCLUSIVE


Month        Aluminum   Copper   Steel   Gray Iron
January          99       101      98       103
February        105       104     107       103
March           103       105     106       105
April           104       100     103       105
May             101       102     102       101
June             97        95     104       100
July             80        82      82        85
August           91        95      90        93
September       108       106     102       103
October         108       102     100       101
November        106       106     102       102
December         98       102     104        99


Source: Table 3 and accompanying text.

(1) Adjustments were made to the data for all years on the basis of
seasonal factors computed from the years 1947 to 1952, inclusive, a
period in which seasonal variations were larger than during the war.
The use of the same seasonal factors throughout stresses the mechanical
character of the foundry index and the changes in seasonals which
have occurred under specific conditions in the past. At the same time,
these factors distort the adjusted series during 1943 and 1944 by
introducing counter-seasonal variations. (2) Because controls were handled
on a calendar-quarter basis by the War Production Board, they seem
to have imparted some special quarterly fluctuations to the output of
foundries. Apparently, more optimistic estimates at the beginning of
a quarter resulted in larger allocations during the first month than in
subsequent months, when more limited stocks appeared to be available;
more pessimistic estimates tended to produce the reverse
[13, p. 51].


Fig. 1. Average Daily Shipments of Selected Castings in the United States,
Seasonally Adjusted. Pounds (000,000).
Source: Tables 3 and 4, converted to average daily series in pounds.


In later years reconversion, strikes, and recessions were important
influences. All of the series dropped in August, 1945, as the result of
V-J Day; the more radical decline and slower recovery of the alumi-


4.2. Relationships Among Foundry Series, 1929-1947


Since no monthly information concerning three of the series is
available for the cyclically important pre-war period, less satisfactory
annual data are employed to assess the magnitude of their reactions to
severe depression and recovery. Production indexes based on figures


Broadly speaking, the conclusions reached earlier concerning the
behavior of the casting series are reinforced by this information. In
general, the production changes of all foundry industries show wide
cyclical movements; again, individual dissimilarities are apparent. In
1937, recoveries from depression lows were much more complete in the
ferrous than in the nonferrous series.


declined in all fields except aluminunewh, 
aircraft production were reflected. 


Fig. 2. Comparison of Indexes of Industrial Production and Durable
Manufactures with Average Daily Shipments of Malleable-Iron and Steel
Castings in the United States. Seasonally adjusted.
Source: Malleable-iron and steel series from Table 3, seasonally adjusted with factors from Table 4,
and converted to average daily shipments in pounds. Federal Reserve indexes from [3, pp. 1324 and 1326].


5. COMPARISON WITH INDEXES OF INDUSTRIAL PRODUCTION 
In Figure 2 the longer record of operations in the malleable i 


TABLE 5


INDEXES OF PHYSICAL PRODUCTION: ALUMINUM, COPPER,
MALLEABLE-IRON, STEEL, AND GRAY-IRON CASTINGS
IN THE UNITED STATES FOR SELECTED
YEARS, 1929 TO 1939, INCLUSIVE


(1929 = 100)

                                Metal
Year    Aluminum    Copper    Malleable Iron    Steel          Gray Iron
1929      100        100          100            100             100
1931       48         51      [illegible]     [illegible]     [illegible]
1933       33         18           36             21              37
1935       58         24           59             37              55
1937       56         58           90             86              94
1939       79         43           69             50              79


Source: [11: 1931, pp. 839, 880, 908, and 992; 1935, p. 1079; 1937, Pt. I, pp. 936, 1023, and 1081;
1939, Vol. II, Pt. 2, pp. 198, 202, 205, 342, and 347].


permits an examination in detail of changes in output of at least two of
the foundry industries during a period of wide cyclical movements,
information not obtainable directly from the limited historical series
available for aluminum, copper, and gray iron.


5.1. Foundry Data More Volatile


Three differences between the foundry series and the indexes of the
Federal Reserve Board account for most of the contrasting reactions:
(1) When compared with the larger and broader indexes the foundry
information is, in a sense, a small sample of business activity; as such
it tends to display the well known larger sampling errors of all small
samples. (2) The casting series 
the Federal Reserve Board’s 
dure has been followed b 


adjustments are modified continually and estimates are made before
all returns are received; these practices tend to introduce subjective
evaluations into the preliminary estimates which may be carried over
to some extent in the final or revised figures. (3) In the foundry series
the utilization throughout the period of a single set of seasonal
adjustment factors, based on the 1947-1952 period, underadjusts some
important seasonal movements prior to World War II. This is especially
notable in the malleable-iron data, for the seasonal requirements of
the manufacturers of automobiles, railroad equipment, and heavy
construction machinery were much greater in the 1920's than they are now
[3, pp. 1292-3 and 2, pp. 2-4].


5.2. Foundry Data Show Little Secular Trend


The general level of foundry activity has changed only a small
amount in the past thirty years. The shipments of steel castings in
1929 were exceeded by less than 8 per cent in the boom years of World
War II; the output of malleable iron expanded only about one-third
between 1929 and the Korean War. On the other hand, the Index of
Industrial Production showed an increase of more than 100 per cent,
the Index of Durable Manufactures, 150 per cent. This characteristic
of foundry data may be explained by the persistent shift from castings
to other methods of forming metal parts, such as stamping, forging,
and welding.


5.3. Foundry Information More Sensitive to Cyclical Change 


The large utilization of castings for capital improvements is reflected
in the amplitude of changes in output levels during business
fluctuations. In fact, the relative volatility of each of these series seems to
vary with the importance of capital goods in the index. Thus, the
foundry series are more sensitive than the Durable Index, while the
latter, in turn, is more variable than the Industrial Production Index.

As the period advanced, the proportionate contractions from peak
to trough (1929-1932, 1937-1938, and 1948-1949) became progressively
greater in the foundry series than in the Federal Reserve indexes;
thus, the declines for malleable iron and steel from 1948 to 1949 were
40 per cent and 58 per cent, respectively, while they were only 10 per
cent and 17 per cent for Industrial Production and Durable
Manufactures. This development may be explained by the new system of
weights being employed by the Federal Reserve Board; instead of
basing the importance of an industry on the gross value of its product,
its influence is determined by the value added to its product. This, of
course, decreases the significance of industries in which the value added
to the product is small in proportion to the total value of the product,
and increases the emphasis of industries with a high value-added ratio.
The importance of the change in weights here is that it now renders the
Federal Reserve indexes less sensitive to cyclical change than they
were before the revision, since items with high value-added ratios such
as tools, instruments, and machinery are generally less sensitive to
business change than goods with low value-added ratios such as coal,
lumber, and blast furnace products [Cf. 3, pp. 1243 and 1277].


5.4. Foundry Information Changes in Phase With Other Indexes 


In the foundry series it is difficult to distinguish the actual peaks
and troughs of business activity from the residual seasonal fluctuations
and the incidental changes resulting from the small character of the
sample. Apart from these problems, once cyclical changes are located
in the Index of Industrial Production, similar movements are apparent
in the malleable and steel series. Since the Index exhibits a slight lead
at peaks and a rough coincidence at troughs [9, p. 60], it may be
assumed that the foundry series do not depart markedly from this timing.


6. AN INDEX OF FOUNDRY ACTIVITY 
Although foundry series generally move at the same time and in the
same direction, they do exhibit individual variations that tend to
distract attention from the more fundamental responses all foundries

foundry activity.


6.1. Alternative Weighting of Foundry Index


of the five industries without regard to the physical or monetary
importance of the castings shipped. (2) The method of weighting the
series according to the number of pounds of castings shipped by each
industry places greatest emphasis on the gray-iron group. These
foundries were found to exhibit greater production stability than the other
series because their castings are used more extensively in consumers'
products than they are in war or defense lines.

(3) The technique of weighting the series according to the value of
the castings produced by each industry results in giving less importance
to the gray-iron group and somewhat more to the costly nonferrous
castings. It imparts emphasis to aircraft and marine castings,
prominent in defense and war activity. (4) The use of value-added data as a
basis of weights is consistent with the weighting system used in the
Indexes of Industrial Production and Durable Manufactures. In effect,
this weighting reduces the relative importance of nonferrous castings
by eliminating the costly metal from the figures. Actually, the index
based on value added to product is almost identical with the one based
on total value of product.


6.2. Selection of the Most Appropriate Weights. 


It is not enough to come to a logical conclusion in the selection of
weights; it is necessary, also, to see how well the index employing the
weights behaves in practice. Two sets of data are available: (1) biennial
figures for the period 1929-1939, and (2) annual information for the
period 1943-1953.

6.2.1. The pre-war period. Shown in Table 6 are weighted indexes of
foundry activity derived from the Census of Manufactures data for
the years 1929 through 1939. The relatives, on a 1929 base (Table 5),
have been combined and weighted by the importance of each metal
in 1947 (Table 1) to form indexes of foundry activity similar to those
described above. For comparison, five related indexes of general
economic activity are presented: (1) Private Investment, a narrow
measure which includes construction of new plants by business, purchases
of producers' durable equipment, and changes in business inventories.
It is, in effect, a measure of domestic, non-government investment.
(2) Total Investment, a broader series which adds to (1) above net
foreign investment (current balance of payments) and government
purchases of goods and services. This is consistent with the usual
definition of investment. (3) Durable Manufactures and Industrial
Production, current Federal Reserve indexes adjusted to the 1929 base.
(4) Gross National Product, a measure of the total economic activity
in the United States. This series and the two investment ones were
computed from data stated in constant (1939) dollars [15].
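The combination of relatives into a weighted index can be sketched as
follows. The relatives are the legible rows of Table 5; the function name
`weighted_index` and the second weight set are hypothetical, introduced
only for illustration, since the Table 1 weights are not reproduced in this
section. With equal weights the calculation gives 29 for 1933 and 64 for
1939, the equally weighted entries shown for those years in Table 6.

```python
# Relatives from Table 5 (1929 = 100), in the order
# aluminum, copper, malleable iron, steel, gray iron.
RELATIVES = {
    1929: [100, 100, 100, 100, 100],
    1933: [33, 18, 36, 21, 37],
    1935: [58, 24, 59, 37, 55],
    1937: [56, 58, 90, 86, 94],
    1939: [79, 43, 69, 50, 79],
}


def weighted_index(relatives, weights):
    """Weighted arithmetic mean of the relatives for one year."""
    if len(relatives) != len(weights):
        raise ValueError("one weight is needed for each series")
    return sum(w * r for w, r in zip(weights, relatives)) / sum(weights)


EQUAL_WEIGHTS = [1, 1, 1, 1, 1]
# Hypothetical weights, for illustration only; these are NOT the Table 1 values.
HYPOTHETICAL_WEIGHTS = [1, 1, 1, 2, 5]

equal_index = {
    year: round(weighted_index(values, EQUAL_WEIGHTS))
    for year, values in RELATIVES.items()
}
hypothetical_index = {
    year: round(weighted_index(values, HYPOTHETICAL_WEIGHTS))
    for year, values in RELATIVES.items()
}
```

Any weighting scheme leaves the base year at 100, so differences among the
indexes arise only from how strongly each scheme emphasizes the more or the
less depressed series.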

A comparison of the related indexes in Table 6 reveals a general
similarity of behavior, with two important exceptions: (1) Private
Investment, Total Investment, and Durable Manufactures reacted more
violently to the depression than the other two series; this observation
is consistent with the notion that investment and durable goods
activity change more drastically than general business operations during

TABLE 6


INDEXES OF FOUNDRY ACTIVITY WEIGHTED VARIOUSLY WITH
1947 FACTORS COMPARED WITH INDEXES OF RELATED
ACTIVITIES IN THE UNITED STATES FOR SELECTED
YEARS 1929 TO 1939, INCLUSIVE


(1929 = 100)

             Foundry Indexes Weighted                          Related Indexes
Year   Equally   Physical      Value   Value    Private   Total     Durable    Industrial   Gross
                 Volume                Added    Invest-   Invest-   Manu-      Produc-      National
                                                ment      ment      factures   tion         Product
1929     100       100          100     100       100       100       100        100          100
1931      44        46           45      45        40        46        52         68           84
1933      29        35           32      33        11        31        40         63           72
1935      47        58           49      50        45        50        63         80           86
1937      77        91           85      86        77        68        92        103          102
1939      64   [illegible]       70      70        66        71        82         98          106


Source: Tables 1 and 5; [15, pp. 26-27]; and [3, pp. 1324 and 1326].


depressions. (2) From 1937 to 1939 Total Investment and Gross
National Product increased while the other series decreased; evidently
spending by the Federal government, which increased Total
Investment and was reflected in a larger Gross National Product, did not
carry over to Private Investment, Durable Manufactures, or Industrial
Production sufficiently to expand the series. Fluctuations in each of
the foundry indexes follow movements in Private Investment.
6.2.2. The war and post-war period. In Table 7 indexes similar to those
in Table 6 are presented for a later period and computed to a
later base, 1947-1949. The foundry indexes are derived from the
monthly values in Table 3; the related indexes are from the national
income⁴ and Federal Reserve data used in computing the comparable
series in Table 6.


information is available on a current dollar basis; it
was not employed because the precision added by quarterly figures did not appear to be as great as
be completely removed from the data.


Substantially more diversity is evident in the movements of the
related series during this than during the earlier period. The enormous
expenditures by the government for war equipment, especially in 1943,
1944, and 1945, account for a large portion of the differences.
Conflicting changes which occurred during the later years are, however,
a logical outcome of the economic process. Castings are employed in
the construction of plants and equipment for the manufacture of
producers' goods, consumers' goods, and war goods. The outbreak of war
in Korea caused an initial expansion of Private Investment and
concomitant foundry production to create and adapt new factories before
large government orders could be filled. Thus, increases in government
spending to swell Total Investment were preceded by the [illegible]
of private investment funds by foundries and others in building or
modernizing facilities with which to make the proper war goods.⁵
Contrasting movements of Private and Total Investments from 1951 to
1952 emphasize the curtailing of Private Investment (and foundry
activity) after the production facilities for the new war were developed.


TABLE 7


INDEXES OF FOUNDRY ACTIVITY WEIGHTED VARIOUSLY WITH
1947 FACTORS COMPARED WITH INDEXES OF RELATED
ACTIVITIES IN THE UNITED STATES,
1943-1953, INCLUSIVE


(1947-49 = 100)

             Foundry Indexes Weighted                            Related Indexes
Year   Equally   Physical      Dollar        Value    Private   Total     Durable    Industrial   Gross
                 Volume        Volume        Added    Invest-   Invest-   Manu-      Produc-      National
                                                      ment      ment      factures   tion         Product
1943     101   [illegible]       87            86        27       163       162        127          103
1944     105        77           89            88        33       183       159        125          110
1945      89        75           81            80        42       162       128        107          108
1946      93        83       [illegible]       87       102       103        86         90           97
1947     100       100          102           102        96        97       101        100           98
1948     109       102          104           104       114       105       104        104          101
1949      82        82           82            82        90        98        95         97          101
1950     110       104          107           106       134       114       116        112          110
1951     124       117          120           119       138       141       128        120          118
1952     110       100          103           103       122       146       136        124          121
1953     119       106          111           110                           153        134


Source: Tables 1 and 3; [15, pp. 26-27]; [3, pp. 1324 and 1326]; and Federal Reserve Bulletin,
March, 1954, p. 295, for "Preliminary" 1953 values for Index of Industrial Production and Index of
Durable Manufactures.


⁵ Actually the Federal government's total purchases of goods and services were larger by $2.1
billion (1939 dollars) in 1949 than they were in 1950 [15, p. 27].


On the basis of these comparisons, weighting the foundry series by
value-added figures seems to be most appropriate: (1) It gives each
foundry industry and associated area an importance relative to the
expenses of producing the castings without regard to the cost of the metal
from which the castings are formed. (2) It tends to give less influence
to nonferrous castings than the equally-weighted scheme and thus
provides an index less heavily affected by occurrences in the aircraft
and marine fields. (3) It gives less prominence to gray-iron castings and
more to aluminum than the physically-weighted index and so tends to
present a more balanced picture of the over-all investment economy.
(4) It is, in essence, a workable average system of weighting which
possesses the good points of all of the foundry series without the
vulnerability of either of the extreme weights. Values for the Index of
Foundry Activity are shown in Table 8.


7. EVALUATION OF THE INDEX


The appraisal of the Index of Foundry Activity is facilitated by a
comparison with the Indexes of Industrial Production, Durable
Manufactures, and Private Investment in Figure 3. Specific attributes are
apparent.

7.1. Timing of Movements Roughly Similar to Those in Index of Industrial
Production


Similarities of movement are evident between the Foundry Index
and the Index of Industrial Production; since the latter has been found
to exhibit some lead at peaks and general consistency at troughs [9, p.
60], it may be reasoned that the Foundry Index, also, is timed in this
general fashion.


7.2. Magnitude of Movements Greater than Index of Industrial Production


Over the period investigated, the size of the fluctuations in foundry
activity has been substantially greater than changes in the Index of
Industrial Production and Gross National Product. In other than war
years, changes in the Foundry Index approximate in a rough way the
amplitude of movements of private investment (Tables 6 and 7); in
spite of the difficulty of comparing annual and monthly data in Figure
3, the generalization is supported also by this illustration.


Fig. 3. Comparison of Index of Foundry Activity with Indexes of Industrial
Production, Durable Manufactures, and Private Investment in the United
States, 1943 to 1954, Inclusive. Seasonally adjusted. (1947-49 = 100).
Source: Foundry data from Table 3, seasonally adjusted by factors from Table 4, and combined in
a weighted-relative index using the value-added-to-product figures from Table 1. Table 8 and [3, pp.
1324 and 1326].
TABLE 8
INDEX OF FOUNDRY ACTIVITY IN THE UNITED STATES
SEASONALLY ADJUSTED
(1947-49 = 100)

Year   Jan.  Feb.  Mar.  Apr.  May   June  July  Aug.  Sept. Oct.  Nov.  Dec.
1943    78                                                          85    90
1944    94                                                          83    85
1945    88                                                          75    74
1946    71                                                         101    94
1947   102                                                         108   102
1948   105                                                         105   100
1949    95                                                          68    82
1950    86    86    86    97   100   106   115   115   117   120   117   130
1951   121   121   125   125   125   126   119   120   116   113   112   108
1952   106   110   109   107   107    87    75   104   104   107   112   110
1953   116   117   115   116   120   112   117   112   102   102    97    96
1954    96    93    88    86


Source: Foundry data from Table 3, seasonally adjusted by factors from Table 4, and combined in a
weighted-relative index using the value-added-to-product figures from Table 1.


7.3. Determination of Index Values Highly Mechanical


Index values may be obtained by applying a stereotyped set of
procedures to the raw data. A disadvantage of this method is the
transmitting of local, extraneous variations to the final index. Two
alternatives are available: (1) identifying the special events by supplementary
notes, and (2) editing the original information to remove the extraneous
material. The first alternative has been selected here on the assumption
that some incidental fluctuations and possible misinterpretations of
the index are easier to comprehend and to correct than an unknown
amount of editing.


7.4. Availability of Data Relatively Prompt 


Foundry data in the Facts for Industry series are customarily issued
about six or seven weeks after the end of the month for which
information is published. This is about as soon as the Preliminary Index of
Industrial Production is released by the Board of Governors in the
mimeographed Business Indexes supplement to National Summary of
Business Conditions. It is substantially earlier than private investment
data collected by the Office of Business Economics of the Department
of Commerce and the Securities and Exchange Commission are
published in the Survey of Current Business [5, p. 9].
REFERENCES
[1] Barton, H. C., Jr., "Adjustment for seasonal variation," Federal Reserve
Bulletin, 27 (1941), 518-28.
[2] Board of Governors of the Federal Reserve System, Seasonal Adjustment
Factors, Washington (1940).
[3] Board of Governors of the Federal Reserve System, "Federal Reserve
monthly index of industrial production," Federal Reserve Bulletin, 39
(1953), 1239-1328.
[4] Clark, J. M., The Economics of Overhead Cost, Chicago (1923).
[5] Foss, M. A., "Investment programs and sales expectations in 1952,"
Survey of Current Business, 34 (1954), 9-12.
[6] Haberler, G. von, Prosperity and Depression, Geneva (1939).
[7] Hussey, Miriam, Foundry Activity as a Business Barometer, Philadelphia
(1950).
[8] Mitchell, W. C., What Happens During Business Cycles, New York (1951).
[9] Moore, G. H., Statistical Indicators of Cyclical Revivals and Recessions,
New York (1950).
[10] Office of Price Administration, Statement of Considerations Involved in the
Issuance of Amendment No. 8 to Revised Maximum Price Regulation No. 125,
Nonferrous Foundry Products, Washington (1945).
[11] U. S. Bureau of the Census, Biennial Census of Manufactures: 1931, 1935,
1937, and 1939, Washington (1935, 1938, 1939, and 1942).
[12] U. S. Bureau of the Census, Census of Manufactures: 1947, Washington
(1949).
[13] U. S. Bureau of Labor Statistics, Handbook of Labor Statistics, Washington
(1948). (1947 Edition).
[14] U. S. Bureau of Labor Statistics, Wage Structure: Foundries, 1945,
Washington (1946).
[15] U. S. Department of Commerce, "National income and product of the
United States, 1952," Survey of Current Business, 33 (1953), 6-32.
[16] War Production Board, Wartime Production Achievements, Washington
(1945).

CARGO LOSS IN FERRYING OPERATIONS 


Walter L. Deemer, Jr.
United States Air Force*


In ferrying operations of valuable items (e.g., aircraft spare parts
needed in a theater of operations during a war), the number of items
to be carried in each aircraft is frequently under the control of the
planner. By using more aircraft he can make the loading per aircraft
smaller and hence possibly reduce the probability of large losses at the
risk of increasing the probability of small losses. (The small loading
also increases the actual ferrying cost.) Because of the value of the
items it may be a good investment to buy this insurance.
In making his decision as to how many items to load per aircraft,
the planner therefore needs to know, as a function of the loading, the
probability of losing any given number of items. Sometimes the mean

In this paper expressions are given for the mean and the variance of
the number of items lost, for the probability of losing a given number
of items and for the probability generating functions. The тоте l 


Each aircraft carries $r$ valuable items, so that a total of $N = kr$ items


* The views expressed here are those of the author and are not to be construed as reporting official
policies.
¹ (m.x), (m = 1, 2, 3, 4) represents formula x for Model m. These are exhibited in Table 1.


ditches. In the event that the aircraft ditches it is assumed that the
items leave the aircraft and float. (If the objects are inanimate, it is
assumed that they are jettisoned.) Some or all of the life rafts or
flotation devices may be equipped with radio transmitters or radar
reflectors to help the search.

It is assumed that there is a constant probability $p$ that an aircraft
will be forced to ditch (i.e., make a forced landing on the water) before
it has delivered its items. It is also assumed that each trip is
independent of all the other trips, so that the distribution of number of aircraft
ditched is given by the expansion of $(p+q)^k$, where $q = 1 - p$.

When an aircraft is ditched, a search is made for the floating
objects.² Some of these may be recovered. The items not recovered will
be referred to as lost.

Four models are considered, each model being defined by a set of
assumptions on the behavior of the floating items and the
characteristics of the search operation.


2. NOTATION 


$k$ = number of carriers used
$r$ = number of items per carrier
$N = kr$ = total number of items carried
$p$ = probability that a carrier will ditch
$q = 1 - p$
$t$ = probability that an individual item will be found
$s = 1 - t$
$f$ = probability that a clump of $r$ lost items will be located
$g = 1 - f$
$A(k, p, j) = C_j^k p^j q^{k-j}$, if $0 \le j \le k$,
            $= 0$, if $j > k$ or $j < 0$.
$A(0, p, 0) = 1$
$D^a = \partial^a / \partial t^a$
$E_m(X)$ = expected value of the random variable $X$ under model $m$
$V_m(X)$ = variance of $X$ under model $m$.


е 
3. DESCRIPTION OF THE MODELS
Model 1. By the assumptions of this model the ditched items ($jr$ in
number if $j$ aircraft are ditched) are distributed over a wide area so
that the conditional distribution of number of items recovered, given
that $j$ aircraft have ditched, is given by the terms of $(t+s)^{jr}$, where $t$
is the probability of recovering a single item; $s = 1 - t$.


² The cost of maintaining search facilities is not relevant to this problem because these are
maintained in any case for the rescue of crews.


Model 2. Here the items are assumed to float together (either
because they are tied together or because the wind and waves have not
separated them) so that if one item is found they are all found. Then
if $f$ is the probability of finding a clump of $r$ items, the conditional
probability of finding $ir$ items given $j$ aircraft are ditched is the term
in $f^i$ of the expansion of $(f+g)^j$, where $g = 1 - f$ and as before $j$ is the
number of aircraft ditched.

Model 3. Here the items do not clump to the extent they do in
Model 2. We assume that there is a probability $f$ of getting in the
vicinity of the $r$ semi-dispersed items and then having arrived in the vicinity
of the items there is a probability $t$ of finding a single item. (One might
get in the vicinity of a clump, for example, by finding the ditched
aircraft or debris from it.) The conditional probability of getting in the
vicinity of $i$ clumps given that $j$ aircraft have been ditched is the term
in $f^i$ of $(f+g)^j$. For each such clump the probability of recovering $a$
items is the term in $t^a$ of $(t+s)^r$.

Model 4. This is like Model 3 except it is assumed that the way one
gets in the vicinity of a clump is actually to find an item. For example,
one of the $r$ items on each aircraft might have a special radio
transmitter or a radar reflector. Here then the probability of finding the
first item of a clump of $r$ items is $f$. Having found the first item, the
probability of finding each successive item in that clump is $t$.

Comments on the Models. It is not claimed that the
mathematical models here considered fit exactly any actual ferrying situation.
For example, it is somewhat unlikely that any actual ferrying situation
fits Model 1, though it might fit when the items are dropped
individually by parachute over a wide area. Some of the other models seem to
be adequate representations of actual situations.

Models 1 and 2 are special cases of Models 3 and 4 as follows: If $f = 1$
in Model 3 we have Model 1. If $t = 1$ in Model 3 or Model 4 we have
Model 2.
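The four sets of assumptions can be turned into an exact loss distribution
by multiplying out, for each aircraft, a polynomial whose coefficient of the
$a$-th power is the probability that the aircraft loses $a$ items. The sketch
below does this for Model 3 and, with a flag, for Model 4; Models 1 and 2
follow by setting $f = 1$ or $t = 1$. The function names and the numerical
values used afterward are hypothetical, chosen only to illustrate the
assumptions as stated above.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (lowest power first)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def poly_pow(a, n):
    """Raise a polynomial to a non-negative integer power."""
    out = [1.0]
    for _ in range(n):
        out = poly_mul(out, a)
    return out


def loss_distribution(k, r, p, t, f, first_item_found=False):
    """Return [P(X = 0), ..., P(X = kr)] for Model 3, or Model 4 if flagged."""
    s, g, q = 1.0 - t, 1.0 - f, 1.0 - p
    # In Model 4 locating the clump means one item is already recovered,
    # so only r - 1 items remain to be searched for individually.
    searched = r - 1 if first_item_found else r
    located = [f * c for c in poly_pow([t, s], searched)]
    per_aircraft = [0.0] * (r + 1)
    for lost, prob in enumerate(located):
        per_aircraft[lost] += p * prob   # ditched, clump located
    per_aircraft[r] += p * g             # ditched, clump never located
    per_aircraft[0] += q                 # aircraft did not ditch
    return poly_pow(per_aircraft, k)     # k independent trips


def mean(dist):
    """Expected number of items lost under the given distribution."""
    return sum(a * prob for a, prob in enumerate(dist))
```

For instance, `loss_distribution(4, 3, 0.1, 0.6, 1.0)` gives the Model 1
distribution for four aircraft carrying three items each, and passing
`first_item_found=True` switches the same call to Model 4.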


4. RESULTS


The results are given in Table 1. In the summation indices $[(a-1)/r]$
means the greatest integer less than or equal to $(a-1)/r$; for $a = 0$,
this is equal to $-1$.


5. DISCUSSION OF RESULTS


In Models 1, 2, and 3 the expected number of items lost is not a
function of the loading, as evidenced by the fact that $k$ and $r$ appear
only as the product $kr = N$ in the expressions for the expected number


CARGO LOSS IN FERRYING OPERATIONS 879 € 


TABLE 1
SUMMARY OF RESULTS


Model 1
Expected number of items lost: $kprs$ (1.1)
Variance of number of items lost: $kprs(t + rsq)$ (1.2)
The probability generating function: $M_1(\phi) = [p(t + s\phi)^r + q]^k$ (1.3)
Formulas for probability of losing exactly a=br+c items (0 ≤ c < r)*:
    $\sum_j A(k, p, j)\, A(jr, s, a)$ (1.4)
    $(s^a/a!)\, D_t^a (q + pt^r)^k$ (useful for small a) (1.5)

Model 2
Expected number of items lost: $kprg$ (2.1)
Variance of number of items lost: $kpr^2 g(1 - gp)$ (2.2)
The probability generating function: $M_2(\phi) = [p(f + g\phi^r) + q]^k$ (2.3)
Formula for probability of losing exactly a=br items:
    $\binom{k}{b} (pg)^b (q + pf)^{k-b}$ (2.4)

Model 3
Expected number of items lost: $kpr(1 - tf)$ (3.1)
Variance of number of items lost: $kpr[qr(1 - tf)^2 + tf(s + rtg)]$ (3.2)
The probability generating function: $M_3(\phi) = \{p[f(t + s\phi)^r + g\phi^r] + q\}^k$ (3.3)
Formulas for probability of losing exactly a=br+c items (0 ≤ c < r)*:
    $\sum_j \sum_i A(k, p, j)\, A(j, f, i)\, A(ir, t, jr - a)$ (3.4)
    $\sum_{n=0}^{b} \binom{k}{n} g^n p^n\, [s^{a-nr}/(a-nr)!]\, D_t^{a-nr} (q + pft^r)^{k-n}$ (useful for small a) (3.5)
    $\sum_j [A(k, p, j)\, t^{jr-a}/(jr-a)!]\, D_s^{jr-a} (g + fs^r)^j$ (useful for kr-a small) (3.6)

Model 4
Expected number of items lost: $kp[r - (s + rt)f]$ (4.1)
Variance of number of items lost: $kp\{fts(r-1) + fg(s + rt)^2 + q[r - f(s + rt)]^2\}$ (4.2)
The probability generating function: $M_4(\phi) = \{p[f(t + s\phi)^{r-1} + g\phi^r] + q\}^k$ (4.3)
Formulas for probability of losing exactly a=br+c items† (0 ≤ c < r):
    $\sum_j \sum_i A(k, p, j)\, A(j, f, i)\, A[i(r-1), t, jr - i - a]$ (4.4)
    $\sum_{n=0}^{b} \binom{k}{n} (gp)^n\, [s^{a-nr}/(a-nr)!]\, D_t^{a-nr} (q + pft^{r-1})^{k-n}$ (useful for small a) (4.5)


* $\sum_j$ is the sum from j=[(a-1)/r]+1 to j=k.
† $\sum_i$ is the sum from i=j-b to i=min(jr-a, j).
880 AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1954


of items lost. In Model 4, the expected number of items lost is an in- 
creasing function of r, the loading per aircraft. This results from the 
fact that in getting in the vicinity of a clump one item is found. In all 
models the variance is an increasing function of r, the loading per air- 
craft. 
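The dependence of the mean and variance on the loading can be tabulated directly from formulas (m.1) and (m.2). The following is a minimal sketch, assuming the Table 1 formulas read as kprs, kprg, kpr(1-tf) and kp[r-(s+rt)f] for the means, with the matching variances; the function name `model_moments` and the choice N=6, p=.1, t=.4, f=.2 are illustrative assumptions, not part of the paper.

```python
def model_moments(model, k, r, p, t=1.0, f=1.0):
    """Mean and variance of the number of items lost (assumed reading of Table 1)."""
    q, s, g = 1 - p, 1 - t, 1 - f
    if model == 1:
        mean = k * p * r * s
        var = k * p * r * s * (t + r * s * q)
    elif model == 2:
        mean = k * p * r * g
        var = k * p * r**2 * g * (1 - g * p)
    elif model == 3:
        mean = k * p * r * (1 - t * f)
        var = k * p * r * (q * r * (1 - t * f) ** 2 + t * f * (s + r * t * g))
    elif model == 4:
        m = r - f * (s + r * t)  # expected loss per ditched aircraft
        mean = k * p * m
        var = k * p * (f * t * s * (r - 1) + f * g * (s + r * t) ** 2 + q * m**2)
    else:
        raise ValueError("model must be 1, 2, 3 or 4")
    return mean, var


N = 6  # total number of items, kr
for model in (1, 2, 3, 4):
    for r in (1, 2, 3, 6):
        mean, var = model_moments(model, N // r, r, p=0.1, t=0.4, f=0.2)
        print(f"Model {model}, k={N // r}, r={r}: mean={mean:.4f}, variance={var:.4f}")
```

With these inputs the Model 3 mean is the same for every loading, the Model 4 mean rises with r, and every variance rises with r, which is the behavior described above.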

After deciding (in conference with the operations officer) which 
model is most applicable to the actual situation, the analyst can pre- 
pare numerical tables showing the probability of losing a items for 
various loadings, based on estimates of the probabilities p, t, and f in- 
volved; p, t, and f can frequently be estimated rather reliably from 
previous experience. These tables can be used in deciding what loading 
to use. 

The probability of losing a items has been found useful in making
decisions as to the desirability of developing salvage equipment, and as
to the best type of salvage equipment.


6. DISCUSSION OF FORMULAS 


6.1. The Probability Generating Functions³


The functions $M_m(\phi)$ are the expected values of $\phi^X$, where X is the
random variable, number of items lost:

$$M_m(\phi) = P\{X = 0\} + \phi P\{X = 1\} + \cdots + \phi^{kr} P\{X = kr\},$$

and hence the probability that X=a is the coefficient of $\phi^a$ in the
expansion of $M_m(\phi)$.

When either k or r is small, $M_m(\phi)$ is easy to expand and the numerical
evaluation after the expansion is quite easy. When both k and r are
large, the expansion and numerical evaluation are usually tedious.


Example

When r=1, the coefficient of $\phi^a$ in $M_3(\phi)$ is

$$\binom{k}{a} (pft + q)^{k-a} (pfs + gp)^a.$$

When r=2, k=3 the expansion of $M_3(\phi)$ is

$$M_3(\phi) = A^3 + 3A^2B\phi + (3AB^2 + 3A^2C)\phi^2 + (B^3 + 6ABC)\phi^3 + (3B^2C + 3AC^2)\phi^4 + 3BC^2\phi^5 + C^3\phi^6,$$

where: $A = pft^2 + q$; $B = 2pfst$; $C = pfs^2 + gp$.
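The expansion in the example can be reproduced mechanically by multiplying polynomials in φ. The sketch below assumes the Model 3 generating function is {p[f(t+sφ)^r + gφ^r] + q}^k; the helper names `poly_mul` and `model3_pmf` and the numerical values of p, f and t are assumptions made for illustration.

```python
from math import comb


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (index = power of phi)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def model3_pmf(k, r, p, f, t):
    """Coefficients of phi^0..phi^(kr) in the assumed form of M_3(phi)."""
    q, s, g = 1 - p, 1 - t, 1 - f
    # One aircraft: q (no ditching) + p*f*(t + s*phi)^r (clump located) + p*g*phi^r (clump missed)
    one = [p * f * comb(r, a) * s**a * t ** (r - a) for a in range(r + 1)]
    one[0] += q
    one[r] += p * g
    pmf = [1.0]
    for _ in range(k):
        pmf = poly_mul(pmf, one)
    return pmf


p, f, t = 0.1, 0.2, 0.4
q, s, g = 1 - p, 1 - t, 1 - f
A, B, C = p * f * t**2 + q, 2 * p * f * s * t, p * f * s**2 + g * p
closed_form = [A**3, 3 * A**2 * B, 3 * A * B**2 + 3 * A**2 * C, B**3 + 6 * A * B * C,
               3 * B**2 * C + 3 * A * C**2, 3 * B * C**2, C**3]
for a, (x, y) in enumerate(zip(model3_pmf(3, 2, p, f, t), closed_form)):
    print(f"P{{X={a}}}: expansion={x:.6f}, closed form={y:.6f}")
```

The two columns agree term by term, and the coefficients sum to 1 as a probability distribution must.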

3 For a more complete discussion of probability generating functions see W. Feller, An Introduction
to Probability Theory and Its Applications, New York, John Wiley and Sons, 1950, p. 212 ff.

The advantage of $M_m(\phi)$ for calculating P{X=a} depends not on
the value of a but on the values of r and k. The other special formulas,
discussed below, are suitable for special values of a, independent of r
and k.

6.2. Special Formulas for Large and Small a


Whether the special formulas (1.5), (3.5), (3.6), and (4.5) are simpler
than the general formulas (m.4) depends not on the value of r and k
but on the value of a.

The utility of these special formulas may be illustrated by exhibiting
the formulas for P{X=0}, P{X=1}, and P{X=6} for (k=6, r=1)
using the equations (3.4), (3.5), and (3.6).

By (3.4)

$$P\{X = 0\} = A(0, f, 0)A(0, t, 0)A(6, p, 0) + A(1, f, 1)A(1, t, 1)A(6, p, 1) + \cdots + A(6, f, 6)A(6, t, 6)A(6, p, 6);$$

$$P\{X = 1\} = A(6, p, 1)[A(1, f, 0)A(0, t, 0) + A(1, f, 1)A(1, t, 0)] + A(6, p, 2)[A(2, f, 1)A(1, t, 1) + A(2, f, 2)A(2, t, 1)] + \cdots + A(6, p, 6)[A(6, f, 5)A(5, t, 5) + A(6, f, 6)A(6, t, 5)];$$

$$P\{X = 6\} = A(6, p, 6)[A(6, f, 0)A(0, t, 0) + A(6, f, 1)A(1, t, 0) + \cdots + A(6, f, 6)A(6, t, 0)].$$

By (3.5)

$$P\{X = 0\} = (q + pft)^6$$
$$P\{X = 1\} = 6p(g + fs)(q + pft)^5.$$

By (3.6)

$$P\{X = 6\} = p^6 (g + fs)^6.$$

These examples show the great simplification possible with the specialized
formulas. On the other hand, the use of equations (m.4) may
be systematized so that the routine of setting up a computing table
and computing from it may be easily learned, and for the average computer
this is an advantage. In Section 8 an example is worked in detail
and in Table 2 a sample computing sheet is shown.
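The agreement between the general double sum and the short forms can be checked numerically for k=6, r=1. This is a sketch under stated assumptions: A(n, p, w) is taken to be the binomial probability of w successes in n trials, (3.4) is taken to be the double sum over j aircraft ditched and i clumps located, and the values p=.1, f=.2, t=.4 are borrowed from the example of Section 8. The function names are hypothetical.

```python
from math import comb


def A(n, p, w):
    """Binomial term: probability of exactly w successes in n trials."""
    if w < 0 or w > n:
        return 0.0
    return comb(n, w) * p**w * (1 - p) ** (n - w)


def p_loss_34(a, k, r, p, f, t):
    """Probability of losing exactly a items by the assumed form of (3.4)."""
    return sum(A(k, p, j) * A(j, f, i) * A(i * r, t, j * r - a)
               for j in range(k + 1) for i in range(j + 1))


k, r, p, f, t = 6, 1, 0.1, 0.2, 0.4
q, s, g = 1 - p, 1 - t, 1 - f

short_forms = {
    0: (q + p * f * t) ** 6,                          # by (3.5)
    1: 6 * p * (g + f * s) * (q + p * f * t) ** 5,    # by (3.5)
    6: p**6 * (g + f * s) ** 6,                       # by (3.6)
}
for a, value in short_forms.items():
    print(f"P{{X={a}}}: double sum={p_loss_34(a, k, r, p, f, t):.6f}, short form={value:.6f}")
```

The double sum has up to 28 terms for each a, whereas each short form is a single product, which illustrates why the special formulas are attractive when only a few values of a are wanted.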



7. OUTLINE OF PROOFS OF EQUATIONS

Three random variables will occur; these will be denoted as follows:


X: Number of items lost
Y: Number of aircraft ditched
Z: Number of clumps located.


The index m refers to the models. For example, $P_m\{X=a\}$ means
the probability of losing a items under model m.

A perfectly general equation for $P_m\{X=a\}$ is

(7.1) $$P_m\{X = a\} = \sum_j \sum_i P_m\{X = a \mid Y = j; Z = i\}\, P_m\{Z = i \mid Y = j\}\, P_m\{Y = j\}.$$

Conceptually j runs from 0 to k and i runs from 0 to j. In the equations
given in Table 1 for $P_m\{X=a\}$, limits on j and i which do not cover
these ranges are given because for some values of i and j some of the
probabilities are identically 0 and hence these terms are omitted. For
example, in $P_m\{X=kr\}$ only j=k gives a non-zero value of
$P_m\{X=kr \mid Y=j; Z=i\}$. Equations (m.4) of Table 1 (m = 1, 2, 3, 4) are based on
(7.1) with the substitution for the conditional probabilities of the binomial
terms appropriate to the particular model.
The probability generating functions (m.3) are the expected values
of $\phi^X$:

(7.2) $$E(\phi^X) = \sum_{a=0}^{kr} \phi^a P\{X = a\}.$$

These may be readily evaluated by using the right side of (7.1) for
P{X=a}, first evaluating

$$\sum_{a=0}^{kr} \phi^a P_m\{X = a \mid Y = j; Z = i\},$$

then multiplying by $P_m\{Z=i \mid Y=j\}$ and summing over i and finally
multiplying by $P_m\{Y=j\}$ and summing over j.

The probability generating functions may also be evaluated by
formal manipulation of conditional expectations:⁴ If X is a random
variable with a binomial distribution: P{X=j}=A(k, p, j), then

$$E(\phi^X) = (p\phi + q)^k.$$

The conditional expectation of $\phi^X$ given Y and Z may be evaluated
using this fact; the conditional expectation of $\phi^X$ given Y and finally the
unconditional expectation of $\phi^X$ may be evaluated by taking the expectations
of the resulting expressions with respect to Z and then with
respect to Y.

4 A description of this method is included at the suggestion of a referee.

The special equations (1.5), (3.5), (3.6), and (4.5) are derived by
using the two following facts:

First,

(7.3) $$\frac{p}{w+1}\, D_q A(n, p, w) = A(n, p, w + 1),$$

where $D_q$ means partial differentiation with respect to q. This may
be demonstrated by direct application of the operation indicated on
the left.

Second, a solution of the difference equation

(7.4) $$F(w + 1, p) = \frac{p}{w+1}\, D_q F(w, p)$$

(where q=1-p) is

(7.5) $$F(w, p) = \frac{p^w}{w!}\, D_q^w F(0, p).$$

This may be proved by induction on w, (7.5) being trivially true
for w=0.
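Both facts can be checked in a few lines. The sketch below assumes that A(n, p, w) is the binomial term with w successes in n trials, that (7.3) reads [p/(w+1)] D_q A(n, p, w) = A(n, p, w+1), and that p is held fixed when differentiating with respect to q; these readings are inferred from the surrounding text.

```latex
\begin{align*}
A(n,p,w) &= \binom{n}{w} p^{w} q^{\,n-w}, \\
\frac{p}{w+1}\, D_q A(n,p,w)
  &= \frac{p}{w+1} \binom{n}{w} (n-w)\, p^{w} q^{\,n-w-1} \\
  &= \binom{n}{w+1} p^{\,w+1} q^{\,n-(w+1)} = A(n,p,w+1),
\intertext{since $\binom{n}{w}\frac{n-w}{w+1} = \binom{n}{w+1}$. For the induction step,
if $F(w,p) = \frac{p^{w}}{w!}\, D_q^{w} F(0,p)$, then by (7.4)}
F(w+1,p) &= \frac{p}{w+1}\, D_q\!\left[\frac{p^{w}}{w!}\, D_q^{w} F(0,p)\right]
          = \frac{p^{\,w+1}}{(w+1)!}\, D_q^{\,w+1} F(0,p).
\end{align*}
```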
For example, (3.5) is derived as follows. From (7.1), putting n=j-i,

(7.6) $$P_3\{X = a\} = \sum_n \sum_j A[(j - n)r, s, a - nr]\, A(j, g, n)\, A(k, p, j),$$

which we write $\sum_n P_n(a)$. Now

(7.7) $$P_n(nr) = \sum_{j=n}^{k} A(k, p, j)\, A(j, g, n)\, t^{(j-n)r} = \binom{k}{n} (gp)^n (q + pft^r)^{k-n}.$$

Let w=a-nr; then $P_n$ plays the role of F of (7.5), with s and t in
the roles of p and q. Hence,

(7.8) $$P_n(a) = \binom{k}{n} g^n p^n\, [s^{a-nr}/(a - nr)!]\, D_t^{a-nr} (q + pft^r)^{k-n},$$

and

(7.9) $$P_3\{X = a\} = \sum_{n=0}^{b} \binom{k}{n} g^n p^n\, [s^{a-nr}/(a - nr)!]\, D_t^{a-nr} (q + pft^r)^{k-n},$$

which is equation (3.5) of Table 1.

The same general method is used to get (1.5), (3.6), and (4.5). No
equation for large a, i.e., a near kr, is needed in Model 4 because when
a is large there are few terms in the sum over i of equation (4.4).

The moments for any model may be obtained from $M_m(\phi)$ by
substituting $e^\theta$ for $\phi$, which gives the moment generating function. The
αth moment about 0 is then the αth derivative of the moment generating
function evaluated at θ=0.
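As a worked instance, the mean of Model 3 can be read off the generating function. The sketch assumes (3.3) has the form shown in the first line; differentiating with respect to φ at φ=1 gives the same first moment as differentiating the moment generating function once at θ=0.

```latex
\begin{align*}
M_3(\phi) &= \{\,p[f(t+s\phi)^{r} + g\phi^{r}] + q\,\}^{k}, \\
M_3'(\phi) &= k\,\{\,p[f(t+s\phi)^{r} + g\phi^{r}] + q\,\}^{k-1}\;
              p\,[\,frs(t+s\phi)^{r-1} + gr\phi^{r-1}\,], \\
M_3'(1) &= k\,(p+q)^{k-1}\, pr\,(fs + g) = kpr\,(1 - tf),
\end{align*}
```

using t+s=1, f+g=1 and p+q=1, in agreement with (3.1).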

Alternatively the moments may be evaluated using conditional 
moments and successively removing the conditions. See, for example, 
M. H. Hansen, W. N. Hurwitz, W. G. Madow, Sample Survey Methods
and Theory, Volume II, New York, John Wiley and Sons, 1953, p. 59 ff.


8. EXAMPLE 


A numerical example will be worked in detail to show the computing
schemes used and to indicate how decisions may be made based on
these models. The values used for probabilities and costs in this example
are not authentic because true values based on actual experience
are classified.

Model 3 is the model assumed for the example. According to this
model the probability of an aircraft ditching is p; the probability of
getting in the vicinity of the r items which were carried in the ditched
aircraft is f; and having arrived in the vicinity of the items the probability
of finding a single item is t.
8.1. Numerical work for the example

For the example the following numerical values are used:

kr, the total number of valuable items to be carried, 6. Each of
the four possible loadings per aircraft: 1, 2, 3, and 6 is investigated.

f=.20 and t=.40 when no salvage device is used;

f=.40 and t=.60 with a salvage device. The salvage device might
be a radio transmitter which sends automatically when the items
are jettisoned.

The expected number of items lost (i.e., ditched and not recovered)
is, for any loading:

without salvage device: 6(0.1)[1-(0.2)(0.4)]=.55.
with salvage device: 6(0.1)[1-(0.4)(0.6)]=.46.




The salvage device reduces the expected number of items lost by
0.09 items. If the items are worth a million dollars each, a salvage
device which costs less than $90,000 would be worth developing. The
cost must be net cost for salvage devices for all of the six items, less
the value of the remaining salvage devices after the ferrying operation
has been completed.
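The arithmetic behind the break-even cost is short enough to script. This is a minimal sketch, assuming the Model 3 mean kpr(1-tf) with kr=6 and p=.1 as used above; the function name `expected_loss` is hypothetical. The $90,000 figure in the text follows from the two-decimal values .55 and .46; carrying more decimals gives a somewhat larger reduction.

```python
def expected_loss(n_items, p, f, t):
    """Expected number of items lost under Model 3: kr * p * (1 - t*f)."""
    return n_items * p * (1 - t * f)


VALUE_PER_ITEM = 1_000_000  # dollars, as assumed in the text

without_device = expected_loss(6, 0.1, f=0.2, t=0.4)
with_device = expected_loss(6, 0.1, f=0.4, t=0.6)
reduction = without_device - with_device

# Break-even cost using the rounded values quoted in the text, and using full precision
breakeven_rounded = (round(without_device, 2) - round(with_device, 2)) * VALUE_PER_ITEM
breakeven_exact = reduction * VALUE_PER_ITEM

print(f"expected loss without device: {without_device:.3f}")
print(f"expected loss with device:    {with_device:.3f}")
print(f"break-even cost (rounded inputs): ${breakeven_rounded:,.0f}")
print(f"break-even cost (full precision): ${breakeven_exact:,.0f}")
```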

The calculation of the probability of losing 0, 1, ..., 6 items for
each of the four possible loadings with and without a salvage device
can be arranged so that the computing labor using equation (3.4) is not
excessive. The computing scheme shown in Table 2 has been found
quite satisfactory.

This scheme is simply a method for systematizing the calculation
of the double summation which gives the probability of losing exactly
a items, P{a}. Usually when making decisions based on P{a}, it is
necessary to calculate P{a} for all (k, r) sets, and when this is necessary
the general formula (m.4) for P{a} is frequently more efficient than
the specialized formulas for P{a} which are also given for some models.

Table 2 is for k=6, r=1, no salvage device. k=6, r=1 was chosen
for the detailed description of Table 2 because this (k, r) pair requires
the most extensive computing sheet.

On an actual computing sheet only the numerical values are entered
because the literal values are not necessary and may be confusing to
the calculator. Blanks in the body of Table 2 are to be read as zeros.

The entries on the computing sheet which are in Roman type are
original inputs, based on the assumed values of k, r, p, f, and t. The
values in italics and bold face are derived values. The operation being
performed in arriving at the italicized values is the multiplication of
P{a|i, j} by P{i|j} for a fixed j and adding over all possible i values.
In terms of the formula given in Table 1 the italicized values are:

$$\sum_{i \le j} A(j, f, i)\, A(ir, t, jr - a)$$

where A(0, t, 0)=1 by definition. For example, the italicized value of
.7787 found in the column for 3 items lost is the sum of products
(.512)(1.00) + (.384)(.600) + (.096)(.36) + (.008)(.216) = .7787, or in
literal notation

$$A(3, f, 0)A(0, t, 0) + A(3, f, 1)A(1, t, 0) + A(3, f, 2)A(2, t, 0) + A(3, f, 3)A(3, t, 0).$$
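The italicized entry quoted above can be recomputed from its definition. The sketch assumes that A(n, p, w) is the binomial probability of w successes in n trials and that the inner sum is Σ A(j, f, i) A(ir, t, jr-a) over i from 0 to j; `italic_value` is a hypothetical name.

```python
from math import comb


def A(n, p, w):
    """Binomial term: probability of exactly w successes in n trials."""
    if w < 0 or w > n:
        return 0.0
    return comb(n, w) * p**w * (1 - p) ** (n - w)


def italic_value(j, a, r, f, t):
    """P{a items lost | j aircraft ditched}: sum over the number i of clumps located."""
    return sum(A(j, f, i) * A(i * r, t, j * r - a) for i in range(j + 1))


# Entry for j=3 aircraft ditched and a=3 items lost, with r=1, f=.20, t=.40
terms = [(A(3, 0.2, i), A(i, 0.4, 0)) for i in range(4)]
for i, (located, none_found) in enumerate(terms):
    print(f"i={i}: A(3,f,{i})={located:.3f}  A({i},t,0)={none_found:.3f}")
print(f"italicized value: {italic_value(3, 3, 1, 0.2, 0.4):.4f}")
```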


TABLE 2
SAMPLE COMPUTING SHEET
No Salvage Device, k=6, r=1, p=.10, q=.90, f=.20, g=.80, t=.40, s=.60

[Body of the computing sheet is illegible in the source; its columns are headed by the number of items lost, 0 through 6.]


To find P{a}, the italicized values are multiplied by the P{j} on
the same line and these products are summed for each column. The
results are the bold face values at the bottom of the columns, which are
the P{a} values. From Table 2:

P{0} = .5604 = (.5314)(1.00) + (.3543)(.08) + (.0984)(.0064)
              + (.0146)(.0005) + (.0012)(.0004)
. . .
P{5} = P{6} = .0000.


Table 3 summarizes the probability values for the example. Values
are presented for the four possible (k, r) pairs, each with and without
a salvage device. The individual probabilities are given in the left
hand column of each pair of columns, and cumulated values of the
probabilities are given in the other column. The cumulated values
show the probability of losing at least so many items. For example, in
Table 3, the .440 in the third column means that the probability of
losing one or more items (at least one item) is .440.
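The individual and cumulated probabilities for every loading can be generated by expanding the Model 3 generating function. This is a sketch under stated assumptions: the generating function is taken to be {p[f(t+sφ)^r + gφ^r] + q}^k, p=.1 throughout, and (f, t) is (.2, .4) without and (.4, .6) with a salvage device; the names `loss_pmf`, `at_least` and `DEVICES` are hypothetical. The output is a recomputation, not a transcription of the printed table.

```python
from math import comb


def loss_pmf(k, r, p, f, t):
    """P{X=a}, a=0..kr, as coefficients of the assumed Model 3 generating function."""
    q, s, g = 1 - p, 1 - t, 1 - f
    one = [p * f * comb(r, a) * s**a * t ** (r - a) for a in range(r + 1)]
    one[0] += q          # aircraft does not ditch
    one[r] += p * g      # aircraft ditches and the clump is not located
    pmf = [1.0]
    for _ in range(k):   # multiply in one aircraft at a time
        nxt = [0.0] * (len(pmf) + r)
        for i, x in enumerate(pmf):
            for a, y in enumerate(one):
                nxt[i + a] += x * y
        pmf = nxt
    return pmf


def at_least(pmf):
    """Cumulated values: entry a is the probability of losing a or more items."""
    out, total = [], 0.0
    for x in reversed(pmf):
        total += x
        out.append(total)
    return out[::-1]


DEVICES = {"no salvage device": (0.2, 0.4), "with salvage device": (0.4, 0.6)}
for k, r in [(6, 1), (3, 2), (2, 3), (1, 6)]:
    for label, (f, t) in DEVICES.items():
        pmf = loss_pmf(k, r, 0.1, f, t)
        print(f"k={k}, r={r}, {label}")
        print("  individual:", [round(x, 3) for x in pmf])
        print("  cumulated: ", [round(x, 3) for x in at_least(pmf)])
```

For k=6, r=1 without a salvage device the cumulated value for one or more items rounds to .440, matching the figure quoted in the text.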


8.2. Decision making for the example⁵


If one loading resulted in a probability of loss which was less than
or equal to that of any other loading for all numbers of items lost, that
loading would be uniformly best.

The necessity for executive decision is based on the fact that this is
not so. If one wishes to minimize the probability of losing one or more
items, the best loading is k=1, r=6. But if one wishes to minimize the
probability of losing 3 or more items, the best loading is k=6, r=1.
However, the loading k=3, r=2 yields almost as low a probability of
losing 3 or more items as the loading k=6, r=1, and the cost of running
the operation has been halved since only half as many aircraft
are involved in the operation. It appears from Table 3 that for the
parameter values involved in the example the loading is a more important
variable than the use or non-use of a salvage device.
One might wish to minimize the probability of losing 6 items if the
crucial consideration were to deliver at least one item. This would be
true for example if the items were special code books. One code book
might be absolutely essential for the performance of an operation and
the delivery of all six code books rather than just one would simply
increase the ease with which the operation could be performed. Under
these conditions one would ordinarily choose the loading that minimized
the probability of losing 6 items, even if this loading led to the
highest probabilities for all other numbers of items lost. This is an
example of a rather typical situation: it is impossible to put a monetary
value on the items being carried, and hence executive judgment is
essential.

5 A referee suggests that reference be made to the recent book by D. Blackwell and M. A. Girshick,
Theory of Games and Statistical Decisions, New York, John Wiley and Sons, 1954, where a detailed
discussion of statistical decision theory may be found.


TABLE 3
SUMMARY OF PROBABILITIES FOR THE EXAMPLE

[Numerical entries are illegible in the source. The table has one row for each number of items lost, 0 through 6, and a pair of columns (individual probability and cumulated probability*) for each of the loadings (k=6, r=1), (k=3, r=2), (k=2, r=3), and (k=1, r=6), with and without a salvage device.]

* The entries in this column are the sums of the values in the preceding column, giving the probability of losing at least the indicated number of items.

If the items were key executives of a corporation, it might be considered
essential to have four of them survive. In this case one would
wish to minimize the probability of losing 2 or more items (executives).
From Table 3 it appears that k=6, r=1 or k=1, r=6 are almost equally
good. If in this case one had no salvage device k=1, r=6 would be the
choice since this costs less than k=6, r=1 and the probabilities are
equal. If one has a salvage device, however, k=6, r=1 is a better loading
than k=1, r=6, since it leads to a slightly lower probability of
losing 2 or more items (.071 vs. .091). The decision as to which loading
to use depends on the extra cost of making six trips instead of one compared
to the extra utility resulting from a reduction of 0.02 in the
probability of losing 2 or more executives. But here again the utility
cannot be measured in money and the decision must be made by executive
judgment.


THE EXPERIMENTAL APPROACH IN THE TEACHING
OF STATISTICS*


EDWIN G. OLDS
Carnegie Institute of Technology


1. INTRODUCTION


At a session of the American Statistical Association, held five years ago
at Cleveland, I presented a paper¹ on the use of instructional aids
in teaching statistical quality control. At that time it was noted [2,
pp. 223-24] that, in the past, the average teacher of elementary probability
had made little use of experiments in his teaching and the remark
was made that

It is hard to understand why he failed to appreciate the pedagogical value
of designing an experiment to illustrate a point of theory, predicting the
result, running the experiment, and then taking the consequences if it
turned out wrong.


Clearly, the existence of some "value" in the use of experiments in
teaching statistics was implied.

In view of the time which has elapsed, it seems appropriate to take
a fresh look at the matter.


2. TYPES OF EXPERIMENTS

In a recent dictionary [4, p. 352] an experiment is defined as:


A trial made to confirm or disprove something doubtful; an operation undertaken
to discover some unknown principle or effect, or to test some suggested
truth, or to demonstrate some known truth; ...


The four classes of trials can be reduced to two by combining aims
one, three and four. Then the types of experiments described in the
definition might be classified as those performed:


(1) to test hypotheses,
(2) for exploration.


3. DEMONSTRATION LECTURES


It would be possible to cross-classify experimentation in learning
statistics on the basis of who performs the experiment. The so-called
experiments used in demonstration lectures may be done by the
teacher, by the students, or by both working together. Any individual
student may be a participant or an observer.

* Presented under a slightly different title at the Annual Meeting of the American Statistical
Association at Washington, D. C., December 29, 1953.
1 [...] publication [2]. Bracketed numbers refer to the list of references at the end of this paper.

Upon reviewing my own demonstrations, it is not at all clear how
many of them should be classified as experiments at all. For analysis,
a lecture-plan from a manual [5], prepared for use in the intensive
courses in statistical quality control given in various industrial centers
during World War II, is presented below. Some description of the
equipment used is given in [2, pp. 224-25] but it is sufficient to note
that the bead-box of reference ordinarily contained 1152 white beads
and 48 red beads.


II, 4.—THE CONTROL CHARTS FOR FRACTION 
DEFECTIVE AND NUMBER OF 
DEFECTIVE ITEMS 


Objective 


1. To present the uses and purposes of a chart for fraction defective or 
number of defective items. 

2. To give the techniques necessary for the construction and utilization of 
these charts. 

3. To demonstrate the variation of fraction defective in samples drawn
from a lot or process having a constant proportion of defective items.

4. To indicate the sensitivity of the limits in detecting a process change. 

Procedure


1. Make introductory remarks on the uses of one of these charts, including
the fact that the necessary data may be readily available in the form
of day-by-day accounting records on one hundred per cent inspection.

2. Point out that the box of beads is to represent a lot of material produced
by a machine; that, after each sample, we must imagine that the lot is
removed and a new lot presented. This means, of course, that the lots
are uniform and the process is controlled. The samples should verify
this fact.

3. Take three or four samples of 50 beads in order to indicate the procedure
and method of computation.

4. Draw and record twenty samples.

5. Plot values for number of defective items and for fraction defective.

6. Compute values for central line and control limits, explaining that the
limits are 3-sigma limits.

7. Put limits on both charts, then point out that they tell the same story.
At this point discontinue consideration of chart for number of defective
items.

8. Examine chart for indication of control and state that since we have
analyzed past data and seem to be in control, we can use p̄ as the standard
value, p', and extend the control limits for use during production.

9. Have students make charts for use in controlling the process and prepare
to record and plot new sampling results.


10. Increase the percentage of defectives in the box and draw samples until 
a point falls outside the control band. 

11. Tell the students about the process change at the beginning of the
second set of drawings, pointing out that it did not show itself immediately.

12. Continue sampling until there are twenty samples in the second set. 

13. Have the students compute the limits for the new set of drawings and
place them on the chart.

14. Give students the value of original fraction defective in the bowl; point
out that, since the first twenty samples of 50 were in control, their total
could be considered as a single sample of 1000.

15. Using original fraction defective, calculate control limits for samples
of 1000 and point out relationship of these limits to the first and second
values of p̄.

16. Return to consideration of the chart having two sets of limits and discuss 
the significance of the area common to the two bands. 
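The bead-box demonstration outlined in the procedure can be rehearsed by simulation before it is given in class. The following is a rough sketch, not part of the manual: it assumes a box of 1152 white and 48 red beads (red counted as defective), samples of 50 drawn without replacement and returned after each draw, and 3-sigma limits computed as p̄ ± 3·sqrt(p̄(1-p̄)/n) with the lower limit truncated at zero. The function names and the random seed are arbitrary.

```python
import random
from math import sqrt


def p_chart_limits(p_bar, n):
    """Lower limit, central line and upper limit of a 3-sigma fraction-defective chart."""
    sigma = sqrt(p_bar * (1 - p_bar) / n)
    return max(0.0, p_bar - 3 * sigma), p_bar, p_bar + 3 * sigma


def draw_samples(box, n, count, rng):
    """Number of defectives in each of `count` samples of n beads; beads are returned after each sample."""
    return [sum(rng.sample(box, n)) for _ in range(count)]


rng = random.Random(1954)
box = [1] * 48 + [0] * 1152          # 1 = red (defective), 0 = white

# Steps 4-7: twenty samples of 50, then limits from the observed average
counts = draw_samples(box, 50, 20, rng)
p_bar = sum(counts) / (20 * 50)
lcl, center, ucl = p_chart_limits(p_bar, 50)
outside = [i for i, c in enumerate(counts, 1) if not lcl <= c / 50 <= ucl]
print("defectives per sample:", counts)
print(f"p-bar={center:.3f}, limits=({lcl:.3f}, {ucl:.3f}), points outside: {outside}")

# Steps 14-15: limits based on the true fraction defective, for n=50 and n=1000
true_p = 48 / 1200
print("n=50  :", [round(x, 4) for x in p_chart_limits(true_p, 50)])
print("n=1000:", [round(x, 4) for x in p_chart_limits(true_p, 1000)])
```

Rerunning with a different seed, or with extra red beads added to `box` for Step 10, shows how slowly a small process change reveals itself with samples of 50.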


Principal points for emphasis 

1. Charts of this kind have proven to be very useful and easy to explain to
management.

2. Many students have found it desirable to start their use of control
charts by analyzing data on fraction defective.

3. Unless samples are quite large, these charts are not very sensitive to
small changes.

4. Control charts for measurements are, in general, to be preferred. They
give much more information for process analysis.


A critical review of the outlined procedure raises the question as to
what the student learns from it. Unless ample time is used to give him
careful explanation he may learn little. I am convinced that much experimental
work falls short in garnering full educational value because
the lecturer has been niggardly in the time allotted to orientation.

Given sufficient time and some imagination an instructor can teach
a good proportion of an elementary statistics course from just one demonstration
of this type.

In the first place, it can be pointed out that the bead box represents
a universe or population. The population is finite and well-defined.
Other populations of interest to statisticians may not be so well-defined.
They may be so large as to seem almost infinite.

Secondly, the individual units can be readily classified into two
classes on the basis of color. If a red bead is called a defective, there
is little argument as to whether or not an individual unit is defective.

Third, the universe is characterized by a single parameter, denoted
[...]. [...] the quality under consideration, the classification
would be based on measurements and the finite universe would be
described by a frequency tabulation, needing at least two parameters
to summarize it adequately.

Fourth, if we are interested in finding out what proportion of the
population is defective, we might count the beads. In general, this procedure
is too expensive so we must depend on the uncertain [...]
from a sample. Obviously a single unit is not sufficient to represent
the universe so we need to examine several units. How large a sample
should we take? How should we take it? What statistic should we
calculate?

A discussion of the above questions can carry us through the elements
of probability; the binomial and hypergeometric distributions,
together with the Poisson and normal distributions as approximations 
to them; an introduction to point and interval estimation; tests of 
hypotheses; and acceptance sampling. Furthermore, the need for 
planned experimentation becomes evident. 
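The relation between the exact and approximate distributions mentioned here can be shown numerically for the bead box. This is a sketch, assuming a box of 1200 beads of which 48 are red and a sample of 50 drawn without replacement; the function names are hypothetical and only the standard library is used.

```python
from math import comb, exp, factorial


def hypergeom_pmf(x, N, K, n):
    """P{x defectives} when n units are drawn without replacement from N units, K defective."""
    return comb(K, x) * comb(N - K, n - x) / comb(N, n)


def binom_pmf(x, n, p):
    """Binomial approximation: n independent draws with defective probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x, lam):
    """Poisson approximation with mean lam = n*p."""
    return exp(-lam) * lam**x / factorial(x)


N, K, n = 1200, 48, 50
p = K / N
print(" x  hypergeometric  binomial  Poisson")
for x in range(8):
    print(f"{x:2d}  {hypergeom_pmf(x, N, K, n):14.4f}  {binom_pmf(x, n, p):8.4f}  {poisson_pmf(x, n * p):7.4f}")
```

The three columns differ only in the second or third decimal place, which gives the class a concrete sense of when the simpler distributions are adequate.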

It might be kept in mind that all of this can be discussed before beginning
the scheduled demonstration. With the box of beads and the
sampling paddle as actors we have given reality to the characters of
our statistical play. Our audience is well acquainted with their human
frailties and will not be too much surprised if they misbehave.

It does not seem necessary to fill in the details for the entire demonstration,
or to call attention to the opportunities for problem and theory
assignments based on it. It might be remarked, however, that one performance
of Steps 4-7 of the indicated procedure provides a single
sample of the operation of the control chart for fraction defective.
Assuming that the data has been produced by a controlled process of
unknown fraction defective, this sample of one can be used to estimate
[...] recognize the possibility of such an error.) Alternatively, the sample
of one can be used to test some hypothesis regarding the probability
of this error of the first kind.

The later parts of the indicated procedure provide other single
samples of control-chart operation under various conditions. These
samples could serve the purposes indicated above or they could be
combined with the first sample to provide a sample of one from the
population of all repetitions of the complete procedure as outlined.
Parenthetically, the opportunity here for confusion in the [...]
concerning sample size and the population sampled should be noted.

Granted that, as a demonstration, the quoted procedure may have
merit, the question remains as to whether or not it should be called
an experiment. The answer seems to depend on the attitude of the instructor
and students toward it. As indicated above, it is possible to
plan and execute the demonstration as one or several experiments.
Furthermore, each may be performed for exploration or to test some
hypothesis.


4. LABORATORY WORK 


It is my impression that much of the laboratory work in statistics 
consists of working problems involving a considerable amount of cal- 
culation. Practice in the choice and use of formulas is valuable and 
there can be no quarrel with the usefulness of learning how to operate 
calculating machines. The question arises, however, whether it would 
be possible to broaden laboratory work to include a few experiments. 

In a paper [3] presented at the Chicago meeting in December, 1952,
Professor A. C. Rosander proposed a list of some forty-five laboratory
experiments for probability statistics and suggested that these be organized
into a manual. The preparation of such a manual would seem
to be a useful enterprise. It would seem difficult, however, to avoid
loading the manual with experiments requiring an undue amount of
mechanical repetition and uninteresting computations.

Industrial laboratories are beginning to be interested in statistically
planned experimentation. For this work the cooperation of statisticians
skilled in experimental design will be sought. It would seem desirable
to begin the training in design by work in the statistical laboratory.
Therefore, a manual such as Rosander proposes ought to include exercises
on designing experiments as well as on executing them.


5. THE EXPERIMENTAL APPROACH IN THE SEARCH FOR TRUTH


Up to this point I have considered the use of the experiment as an
aid in digesting the existing body of statistical principles and methods.
There is quite another aspect of the experimental approach which now
deserves some attention. I refer to the use of experimentation in pushing
back statistical frontiers.

The role of induction and generalization in the field of statistics is
well known, with the contribution of "Student" as a shining example.
It seems reasonable to expect that the study of samples will continue
[...] needed to [...] gap between the experimenter [...].
With the greater availability of high-speed computers experimental
sampling can be done on a scale undreamed of fifty years ago. A
Weldon² who rolls a set of 12 dice a total of 26,306 times yields his
place in print to the machine which turns out a few million pseudo-random
numbers while warming up for a really big job. The inductive
reasoner no longer need restrict his operation to a few small samples.
As a footnote to this discussion of using the experimental approach
in the search for truth, an additional remark might be added. Belief
is an individual matter. Some students require one type of proof, some
another. One of my beginners could not accept the mathematical derivation
of the probability density function for the sum of a sample of
two from a rectangular universe. After he and his wife spent several
hours drawing samples from a double deck of cards he was entirely
satisfied with the truth of the theorem.
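The student's card-drawing exercise is easy to imitate. The sketch below is a discrete stand-in for the theorem, assuming two independent draws, with replacement, from the values 1 through 10 (an idealization of drawing from a double deck, which is an assumption of this example and not a description of what the student did). The exact distribution of the sum is triangular, and the simulated frequencies settle toward it.

```python
import random
from collections import Counter
from fractions import Fraction


def exact_sum_pmf(n):
    """Exact distribution of the sum of two independent draws from a uniform universe 1..n."""
    pmf = Counter()
    for a in range(1, n + 1):
        for b in range(1, n + 1):
            pmf[a + b] += Fraction(1, n * n)
    return dict(pmf)


def simulate_sum(n, trials, rng):
    """Relative frequency of each sum in `trials` simulated pairs of draws."""
    counts = Counter(rng.randint(1, n) + rng.randint(1, n) for _ in range(trials))
    return {s: counts[s] / trials for s in range(2, 2 * n + 1)}


rng = random.Random(12)
exact = exact_sum_pmf(10)
observed = simulate_sum(10, 20000, rng)
for s in range(2, 21):
    print(f"sum={s:2d}  exact={float(exact[s]):.3f}  observed={observed[s]:.3f}")
```

The exact probabilities rise in equal steps from 1/100 at a sum of 2 to 10/100 at a sum of 11 and fall in equal steps thereafter.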

This is, by no means, an isolated example. It is my belief that a con- 
siderable proportion of our elementary students have little faith (or 
interest) in mathematically established truths until they have seen 
experimental verification. One definition of “experiment” quoted above 
was 

“A trial made. . . to demonstrate some known truth” 


For the thousands of men taught statistical quality control in the past
dozen years the trials made to demonstrate the instructor-known
truths often have seemed to be the only means of getting the message
through to its destination.


6. EVALUATION OF RESULTS

My previous remarks regarding the experimental approach imply
that it has some positive value. This opinion might well be tested by
means of a designed experiment. Presumably any scientifically-minded
educator who advocates a particular plan of [...] is willing to give
his pet theory an objective test. Statisticians, in particular, claim that
they prefer to base conclusions on facts rather than opinions.

For our statistician-seedlings we might have two sets of treatments.
The first set, as discussed in Section 2 above, might consist of the
exploratory experiment and the experiment to test hypotheses. The
second set of treatments might include the experiment by the instructor
in lecture (with or without help) and the experiment by the student
in the laboratory.

² Weldon's dice data are given in [1, p. 278].

896 AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1954

The treatments in the second set could be applied in various
strengths. Either we could rush through the experiment, or we could
provide various amounts of orientation before, during and after
performance. I have averred that the student learns little unless the
ground-work is carefully laid; but this is an opinion needing proof.
Measurements of two kinds of results should be made on our sub- 
jects. In the first place, increase in knowledge of existing principles 
and methods should be gauged. Second, there should be a measurement 
of discovery-value. 
Finally, the planned procedure ought to provide for a control group
which receives no teaching by the experimental approach.
It is not my intention to extend the discussion of this test any
further beyond remarking that in order to plan and conduct it properly
the cooperation of a professional in the field of educational tests and 
measurements probably would be necessary. 


7. CONCLUSION 


In the early part of this paper the opinion has been expressed that
the experimental approach has value and a few suggestions have been
made regarding combinations of treatments for best results. In the
last section it has been suggested that these hypotheses should be
[...]
the earlier statements in this paper, I am confident that the
recommendation for the collection of some unbiased evidence will receive
support.


REFERENCES

[1] Fry, Thornton C., Probability and Its Engineering Uses, New York, Van Nostrand
[2] Olds, Edwin G., and Knowler, Lloyd A., “Teaching statistical quality control
    for town and gown,” Journal of the American Statistical Association, 44 (1949),
    213-30.
[3] Rosander, A. C., “The use of laboratory experiments in teaching probability
    statistics” (unpublished manuscript).
[4] Webster's Collegiate Dictionary, Fifth Edition, Springfield, Mass., Merriam
[5] Working, H., and Olds, E. G., An Introduction to Statistical Methods of
    Quality Control in Industry: An Outline of a Course of Lectures and Exercises.
    Quality Control Program, Carnegie Institute of Technology (1945).
    Prepared for Office of Production Research and Development, War Production
    Board and now available at Government Printing Office, Washington, D. C.



USE OF EXPERIMENTS IN TEACHING 
ENGINEERING STATISTICS* 


Irving W. Burr
Purdue University


1. INTRODUCTION

Extensive use of sampling experiments was one of the characteristic
features of the war-time intensive courses in statistical quality
control [5], and is still an integral part of the present short courses in
the subject. For those with as little mathematical skill and
understanding as many industrial men have, derivations are out of the
question, and the only recourse is to demonstrate the theory by sampling
experiments. Industrial men are readily convinced by such
experiments, especially when they themselves do the sampling.

The situation would seem to be quite different when one is teaching 
engineering students who have had calculus. It is not as different, 
however, as it seems. Engineering students are intensely practical and 
few of them are fond of mathematical theory. Hence well chosen and
carefully explained experiments are a valuable supplement to
mathematical derivation. Even for a student majoring in mathematics,
experiments can serve to illuminate the theory, and they certainly do
give the average student a clearer idea of “statistical thinking.”

The instructor should make clear (a) the principle involved, (b) the
comparison of the observed results with theory, (c) the necessity for 
careful arrangement for randomness, and (d) the fallibility of the ex- 
periment, explaining any “misbehaving” of the experiment. 


2. SAMPLING POPULATIONS


For fraction defective experiments a convenient way to simulate a
binomial distribution is to use a box of beads.¹ One color, say white,
can be called good, and another, say red, can be called defective. Then
if we use a great many more beads in the box than in a sample, the
actual hypergeometric distribution will give a good approximation to
the desired binomial. It is convenient to maintain 1000 white beads
and then vary the number of red ones for various fractions: 42 for 4
per cent (42/1042), 87 for 8 per cent, etc. Samples of 50 seem to work
well. A description of paddles for such drawing is given in part III of
an article by Olds and Knowler [4]. The relative error in approximating
a binomial probability by a hypergeometric probability is
approximately [2]

$$\frac{1}{2Np'q'}\left[d - (d - np')^{2}\right],$$

where N and n are the lot and sample sizes respectively, d is the
number of defectives in the sample, p' is the fraction defective in the lot,
and q' is 1 − p'.

* Presented at the Annual Meeting of the American Statistical Association at Washington, D. C.,
December 29, 1953.
¹ Suitable beads one centimeter in diameter of various colors may be obtained from Waco Bead
Co., 87 West 87th St., New York City.


                              TABLE 1
            POISSON POPULATIONS FOR SAMPLING EXPERIMENTS

                      Number of Chips for Given Parameter
No. of
Defects             c'=1     c'=2     c'=4     c'=6     c'=8
   0                 184       68        9        1        0
   1                 184      135       37        7        1
   2                  92      135       73       22        5
   3                  31       90       98       45       14
   4                   8       45       98       67       29
   5                   2       18       78       80       46
   6                            6       52       80       61
   7                            2       30       69       70
   8                            1       15       52       70
   9                                     7       34       62
  10                                     3       21       50
  11                                     1       11       36
  12                                              6       24
  13                                              3       15
  14                                              1        8
  15                                              1        5
  16                                                       2
  17                                                       1
  18                                                       1
Total                501      500      501      500      500
Actual Average     1.004    2.006    4.002    6.020    8.012
Actual Variance    1.006    2.042    3.982    6.036    7.916

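
The quoted relative-error approximation for replacing the binomial by the bead-box hypergeometric can be checked numerically. This is a minimal sketch, assuming the box of 1000 white and 42 red beads and samples of 50 mentioned in the text; the approximation is coded in the form (d − (d − np')²)/(2Np'q'), and the function names are illustrative only.

```python
from math import comb

def hypergeom_pmf(d, n, N, D):
    """P(d defectives) when n beads are drawn without replacement from N beads, D of them red."""
    return comb(D, d) * comb(N - D, n - d) / comb(N, n)

def binom_pmf(d, n, p):
    """P(d defectives) in n independent draws with fraction defective p."""
    return comb(n, d) * p**d * (1 - p) ** (n - d)

def relative_error_exact(d, n, N, D):
    """(hypergeometric - binomial) / binomial, computed exactly."""
    return hypergeom_pmf(d, n, N, D) / binom_pmf(d, n, D / N) - 1

def relative_error_approx(d, n, N, D):
    """Approximate relative error, (d - (d - n p')^2) / (2 N p' q')."""
    p = D / N
    q = 1 - p
    return (d - (d - n * p) ** 2) / (2 * N * p * q)

if __name__ == "__main__":
    N, D, n = 1042, 42, 50  # 1000 white + 42 red beads, samples of 50
    for d in range(6):
        print(d, round(relative_error_exact(d, n, N, D), 4),
              round(relative_error_approx(d, n, N, D), 4))
```

For this box the error is a few per cent at most over the values of d that occur with any frequency, which is why the bead box serves well as a binomial population.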

For the Poisson distribution, no such convenient method is available,
other than that of making the double approximation of the
hypergeometric for the Poisson. The best possibility would seem to be to
use beads or chips, each numbered with a number of defects.² Then
drawing one chip or bead gives a value of c, the number of defects. For
such a Poisson population it is desirable to have at least 500 pieces
so that there will be a few rare values of c available in the population,
in order that a c chart can go out of control. If too few chips, say 100,
are in the population there probably will be no rare values of c; none
can be rarer than 1 in 100. The populations shown in Table 1 are quite
serviceable. It should be noted that if a sample from a population with
a larger parameter, c', is desired, this is readily available because of
the additive property of the Poisson distribution. Thus, if we want
c' = 14, we can draw one chip from c' = 8 and one from c' = 6, and the
total of the two c values will have the desired distribution.
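
The chip populations and the additive trick can be tried out as follows. This is a sketch only: the counts are transcribed from Table 1 (taking 73 chips for two defects at c' = 4, the value that agrees with the printed total, average and variance), draws are made with replacement, and the helper names are hypothetical.

```python
import random

# Chip counts from Table 1, indexed by number of defects (0, 1, 2, ...).
TABLE_1 = {
    1: [184, 184, 92, 31, 8, 2],
    2: [68, 135, 135, 90, 45, 18, 6, 2, 1],
    4: [9, 37, 73, 98, 98, 78, 52, 30, 15, 7, 3, 1],
    6: [1, 7, 22, 45, 67, 80, 80, 69, 52, 34, 21, 11, 6, 3, 1, 1],
    8: [0, 1, 5, 14, 29, 46, 61, 70, 70, 62, 50, 36, 24, 15, 8, 5, 2, 1, 1],
}

def mean_and_variance(counts):
    """Total number of chips, with the population mean and variance (divisor N)."""
    total = sum(counts)
    mean = sum(c * f for c, f in enumerate(counts)) / total
    var = sum(f * (c - mean) ** 2 for c, f in enumerate(counts)) / total
    return total, mean, var

def draw(counts, rng):
    """Draw one chip (with replacement) and return its number of defects."""
    return rng.choices(range(len(counts)), weights=counts, k=1)[0]

def draw_sum(params, rng):
    """One chip from each listed population; the total imitates a larger c'."""
    return sum(draw(TABLE_1[p], rng) for p in params)

if __name__ == "__main__":
    for c_prime, counts in TABLE_1.items():
        total, mean, var = mean_and_variance(counts)
        print(c_prime, total, round(mean, 3), round(var, 3))
    rng = random.Random(0)
    print([draw_sum((8, 6), rng) for _ in range(10)])  # imitates c' = 14
```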

For normally distributed populations Table 2 gives a flexible set of
distributions. These were used in the war-time courses and are still in
wide use. Another convenient way to generate approximately normal
populations is through using various numbers of dice. Although the
distribution of the number of points on a single die is rectangular, the
total number of points on 3 dice is fairly normal, as the following
theoretical distribution shows:

Total of 3 dice      3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18   Total
216 x Probability    1  3  6 10 15 21 25 27 27 25 21 15 10  6  3  1     216

For n dice at a throw we have the following theoretical results for the
total points:
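
As a hedged illustration of such results, the exact distribution of the total for n fair dice can be built by repeated convolution. The sketch below uses the standard facts that the mean is 7n/2 and the variance 35n/12; these are stated here as assumptions about what is meant, and the function names are illustrative.

```python
from fractions import Fraction

def dice_total_counts(n):
    """Number of ways each total can occur when n fair dice are thrown."""
    counts = {0: 1}
    for _ in range(n):
        nxt = {}
        for total, ways in counts.items():
            for face in range(1, 7):
                nxt[total + face] = nxt.get(total + face, 0) + ways
        counts = nxt
    return counts

def mean_and_variance(n):
    """Exact mean and variance of the total points for n dice."""
    counts = dice_total_counts(n)
    outcomes = 6**n
    mean = Fraction(sum(t * w for t, w in counts.items()), outcomes)
    var = sum(w * (t - mean) ** 2 for t, w in counts.items()) / outcomes
    return mean, var

if __name__ == "__main__":
    print([dice_total_counts(3)[t] for t in range(3, 19)])
    for n in (1, 2, 3, 6, 10):
        mean, var = mean_and_variance(n)
        print(n, float(mean), float(var), float(var) ** 0.5)
```

For 3 dice this gives a mean of 10.5 and a standard deviation of about 2.96, and for 6 dice a mean of 21 and a standard deviation of about 4.18.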



3. SAMPLING EXPERIMENTS

In any one course one would never use all of the experiments here
suggested. Nevertheless it seems desirable to list them all. 


3.1. Sample Frequency Distribution 

A simple and effective demonstration of the binomial may be had by
casting 6 dice and counting the number of aces appearing on each such
cast. One hundred casts will give an interesting comparison between
the observed frequencies of numbers of aces and the theoretical, from
$100\,(5/6 + 1/6)^{6}$. The chi-square test of goodness of fit may be used, if
desired.
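
A minimal sketch of this experiment follows. The theoretical frequencies are the terms of 100(5/6 + 1/6)^6; pooling the classes of three or more aces before computing chi-square is an assumption made here to keep expected frequencies reasonably large, and the function names are illustrative.

```python
import random
from math import comb

def expected_ace_counts(casts=100, dice=6):
    """Theoretical frequencies of 0, 1, ..., dice aces in the given number of casts."""
    p = 1 / 6
    return [casts * comb(dice, k) * p**k * (1 - p) ** (dice - k) for k in range(dice + 1)]

def observed_ace_counts(casts=100, dice=6, seed=0):
    """Simulated frequencies of the number of aces per cast."""
    rng = random.Random(seed)
    counts = [0] * (dice + 1)
    for _ in range(casts):
        aces = sum(1 for _ in range(dice) if rng.randint(1, 6) == 1)
        counts[aces] += 1
    return counts

def chi_square(observed, expected, pool_from=3):
    """Chi-square statistic with all classes from `pool_from` upward pooled."""
    obs = observed[:pool_from] + [sum(observed[pool_from:])]
    exp = expected[:pool_from] + [sum(expected[pool_from:])]
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp))

if __name__ == "__main__":
    exp = expected_ace_counts()
    obs = observed_ace_counts()
    print([round(e, 1) for e in exp])
    print(obs)
    print(round(chi_square(obs, exp), 2))  # compare with chi-square on 3 d.f.
```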


Binomial samples from beads may be taken with much smaller
fraction defective, thus illustrating better the typical situation in industrial
lots. Samples from any of the populations in Tables 1 and 2, or dice
totals make good examples to compare theory and sample.

² White fiber chips may be bought from Lamb Seal and Stencil Co., 824 13th St., N.W., Washington,
D. C. However, if one can find the right industrial company he can obtain thousands of fiber chips,
since they are scrap in many processes.


                             TABLE 2
                 APPROXIMATELY NORMAL POPULATIONS

                    Frequencies for Population
  No.        A       B       C       D       E
 -10                                 1
  -9                                 1       1
  -8                                 1       1
  -7                 1               3       1
  -6                 3               5       3
  -5         1      10               8       5
  -4         3      23              12       8
  -3        10      39       1      16      12
  -2        23      48       3      20      16
  -1        39      39      10      22      20
   0        48      23      23      23      22
  +1        39      10      39      22      23
  +2        23       3      48      20      22
  +3        10       1      39      16      20
  +4         3              23      12      16
  +5         1              10       8      12
  +6                         3       5       8
  +7                         1       3       5
  +8                                 1       3
  +9                                 1       1
 +10                                 1       1
 +11                                         1


3.2. Control Charts

Effective use of the foregoing populations may be made to illustrate
$\bar{X}$, R, p, np, and c charts. One may take samples to compare against
center line and limits calculated from the known population
characteristics. But perhaps more interesting is the case of “no standard
given.” A series of preliminary samples are drawn and control lines
figured. Control is checked, the lines continued, and new “production”
analyzed. The population can then be changed by increasing p' for
fraction defective, or by using a different measurement or Poisson
population. Thus one might use Population A of Table 2, then shift to
B or D, or use c' = 4 in Table 1, then shift to c' = 8 or 1, and observe
the effect on the chart.

Other interesting experiments have to do with ways in which samples
can be taken [5], [2], [3]. For example, a stratified sample may be taken
by letting $X_1$ be the total points for 2 dice, $X_2$ that for 3 dice, $X_3$ for 4
dice, $X_4$ for 5 dice. Then the first $X_1$, $X_2$, $X_3$, and $X_4$ together comprise
the first sample, etc. Such a sample is stratified because it contains
one value from each of 4 populations. The range for such samples will
be so inflated by the difference between means, especially $X_1$ and $X_4$,
that both charts for $\bar{X}$ and R will show “too good” control. On the
other hand if the first four $X_1$ from the same data are taken in one
sample, then four $X_2$, etc., and all the data put on one $\bar{X}$ and R chart
the $\bar{X}_1$'s and $\bar{X}_4$'s will stand out beyond the control limits like “sore
thumbs,” because they are out of control with respect to the rational
sample variability. The same experiment may be readily done with 3
distributions A, one of B and one of C. Such populations can then be
mixed, and samples drawn, illustrating how random samples from
mixed product show good control despite the presence of assignable
causes.
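
The contrast between the two ways of grouping the same data can be sketched in code. The following is an illustration under assumptions, not the author's procedure: ten samples per population is an arbitrary choice, the limits are the usual grand mean ± A₂R̄ with the standard factor A₂ = 0.729 for samples of four, and all function names are hypothetical.

```python
import random

A2 = 0.729  # standard X-bar chart factor for samples of size 4

def throw(n_dice, rng):
    """Total points on n_dice fair dice."""
    return sum(rng.randint(1, 6) for _ in range(n_dice))

def out_of_control_means(samples):
    """Count sample means outside grand mean +/- A2 * R-bar; also return R-bar."""
    means = [sum(s) / len(s) for s in samples]
    ranges = [max(s) - min(s) for s in samples]
    grand_mean = sum(means) / len(means)
    r_bar = sum(ranges) / len(ranges)
    lower, upper = grand_mean - A2 * r_bar, grand_mean + A2 * r_bar
    return sum(1 for m in means if m < lower or m > upper), r_bar

def run(samples_per_population=10, seed=0):
    """Group the same dice data in the stratified way and in the rational way."""
    rng = random.Random(seed)
    dice = (2, 3, 4, 5)
    m = 4 * samples_per_population
    data = {n: [throw(n, rng) for _ in range(m)] for n in dice}
    # Stratified: each sample holds one value from each of the four populations.
    stratified = [[data[n][i] for n in dice] for i in range(m)]
    # Rational: each sample holds four consecutive values from one population.
    rational = [data[n][4 * j: 4 * j + 4] for n in dice for j in range(samples_per_population)]
    return out_of_control_means(stratified), out_of_control_means(rational)

if __name__ == "__main__":
    (s_out, s_rbar), (r_out, r_rbar) = run()
    print("stratified: points out =", s_out, " R-bar =", round(s_rbar, 2))
    print("rational:   points out =", r_out, " R-bar =", round(r_rbar, 2))
```

With the stratified grouping the inflated average range widens the limits so that essentially no mean falls outside, while the rational grouping flags many of the 2-dice and 5-dice samples.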


3.3. Significance Tests 


Any of the standard significance tests may be made by using a
sample from one of the populations. Thus one can test the hypothesis
that $\bar{X}' = 0$, using either population A or D, and assuming either that
$\sigma'$ is known or unknown. Or one can test the same hypothesis against
populations B, C, or E. An interesting variation is to let each class
member make the test by drawing his own sample. In this way one would
expect about one class member in 20 to refute a true hypothesis, at the
5 per cent level. Also the class can thus build up a t, or a non-central t
distribution.
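
A hedged sketch of the class experiment: each simulated class member draws a sample from Population A (frequencies as in Table 2) and tests the true hypothesis of zero mean with the population standard deviation treated as known. The sample size of 5, the two-sided critical value 1.96, and drawing with replacement are assumptions made for illustration.

```python
import math
import random

# Population A of Table 2: chip value -> number of chips.
POP_A = {-5: 1, -4: 3, -3: 10, -2: 23, -1: 39, 0: 48, 1: 39, 2: 23, 3: 10, 4: 3, 5: 1}

def sigma(pop):
    """Population standard deviation (divisor N)."""
    n = sum(pop.values())
    mean = sum(x * f for x, f in pop.items()) / n
    return math.sqrt(sum(f * (x - mean) ** 2 for x, f in pop.items()) / n)

def rejection_rate(pop, sample_size=5, students=4000, z_crit=1.96, seed=0):
    """Fraction of students whose z test rejects the hypothesis of zero mean."""
    rng = random.Random(seed)
    values, weights = list(pop), list(pop.values())
    sd = sigma(pop)
    rejections = 0
    for _ in range(students):
        sample = rng.choices(values, weights=weights, k=sample_size)
        z = (sum(sample) / sample_size) / (sd / math.sqrt(sample_size))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / students

if __name__ == "__main__":
    print(round(sigma(POP_A), 4))
    print(rejection_rate(POP_A))  # should be near 0.05, i.e. about 1 in 20
```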


3.4. Estimation

The subject of biased and unbiased estimates of population
parameters may be illustrated by drawing a series of small samples and
tabulating the various kinds of estimates, say, of both population standard
deviation and variance.
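
The tabulation can be imitated as below. This is a sketch assuming samples of 4 drawn with replacement from Population A of Table 2 (population variance 2.94); the three estimators compared and their labels are illustrative choices.

```python
import math
import random

# Population A of Table 2: chip value -> number of chips (variance 2.94).
POP_A = {-5: 1, -4: 3, -3: 10, -2: 23, -1: 39, 0: 48, 1: 39, 2: 23, 3: 10, 4: 3, 5: 1}

def estimator_averages(pop, sample_size=4, samples=5000, seed=0):
    """Average of three estimators over many small samples."""
    rng = random.Random(seed)
    values, weights = list(pop), list(pop.values())
    sums = {"var_n": 0.0, "var_n_minus_1": 0.0, "sd_n_minus_1": 0.0}
    for _ in range(samples):
        x = rng.choices(values, weights=weights, k=sample_size)
        mean = sum(x) / sample_size
        ss = sum((v - mean) ** 2 for v in x)
        sums["var_n"] += ss / sample_size                          # biased for the variance
        sums["var_n_minus_1"] += ss / (sample_size - 1)            # unbiased for the variance
        sums["sd_n_minus_1"] += math.sqrt(ss / (sample_size - 1))  # biased for sigma
    return {name: total / samples for name, total in sums.items()}

if __name__ == "__main__":
    for name, value in estimator_averages(POP_A).items():
        print(f"{name:15s} {value:.3f}")
    print("true variance 2.940, true sigma", round(math.sqrt(2.94), 3))
```

The run shows the divisor-n variance averaging well below 2.94, the divisor-(n − 1) variance averaging close to it, and the square root of the latter still averaging below the true standard deviation.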


Confidence intervals may be set from each of a series of samples.
There is a beautiful illustration of this kind of experiment given by
Shewhart [1]. It is advisable to use 90 per cent confidence intervals so 
that there is a reasonable chance for an interval to fail to contain the 
parameter. 


3.5. Analysis of Variance

Obviously one can run an analysis of variance with all cell samples
from distribution A, to illustrate the null hypothesis. Data for simple 
designs can be repeatedly drawn and calculations made to yield a dis- 
tribution of F values. Then one or two cell samples can be drawn from 
B or C and thus the null hypothesis stands to be rejected some of the 
time at least. (Of course one can more easily illustrate the formation 
of an F distribution with pairs of samples from A, or say from 4 dice, 
and the non-central F by a sample from A and one from D, or one 
with 4 dice then one with, say, 10 dice.) 

Less obvious but equally valuable for illustration are experiments
where the true cell mean is determined from any desired linear
hypothesis, and then a random error component is added, such as a drawing
from distribution A, or a toss of 3 dice. A linear or quadratic trend
can readily be simulated. Such trends must be strong relative to the
random error, unless large samples are to be drawn at each point.

Tests for homogeneity of variance, such as Bartlett's, and
homogeneity of proportions, such as chi-square, also can be readily
illustrated.


3.6. Statistics of Combinations

One good experiment to illustrate the tolerances for mating parts is
often worth any amount of equations for the practical man. A
successful one is the following:

Outside diameter of shaft

    $\bar{X}'$ = 3.00105", $\sigma'$ = .000296".

Throw 3 dice, and regard the number of points as the number of
ten-thousandths of an inch in excess of three inches.

Inside diameter of bearing

    $\bar{X}'$ = 3.00210", $\sigma'$ = .000418".

Throw 6 dice, and regard the number of points as the number of
ten-thousandths of an inch in excess of three inches.

Draw the two frequency distributions, the $3\sigma$ limits for each and the
true extreme possible values for each. The percentage of interference is
only about 2.6 per cent as may be illustrated by experiment, but guesses
prior to experiment, based on the two frequency distributions shown,
[...]
the extreme ranges, would never sanction such distributions, even
though they would probably be satisfactory in practice. Hence such
persons would specify a considerably greater mean difference, thus
giving too many loose fits.
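
Because both diameters come from dice, the chance of interference can also be enumerated exactly. The sketch below assumes that interference means the 3-dice shaft total is at least as large as the 6-dice bearing total (ties counted), with a switch for the strict case; function names are illustrative.

```python
from fractions import Fraction

def dice_total_counts(n):
    """Number of ways each total can occur when n fair dice are thrown."""
    counts = {0: 1}
    for _ in range(n):
        nxt = {}
        for total, ways in counts.items():
            for face in range(1, 7):
                nxt[total + face] = nxt.get(total + face, 0) + ways
        counts = nxt
    return counts

def interference_probability(shaft_dice=3, bearing_dice=6, count_ties=True):
    """Exact probability that the bearing total does not exceed the shaft total."""
    shaft = dice_total_counts(shaft_dice)
    bearing = dice_total_counts(bearing_dice)
    ways = sum(
        ws * wb
        for s, ws in shaft.items()
        for b, wb in bearing.items()
        if (b <= s if count_ties else b < s)
    )
    return Fraction(ways, 6 ** (shaft_dice + bearing_dice))

if __name__ == "__main__":
    print(float(interference_probability()))                  # ties counted
    print(float(interference_probability(count_ties=False)))  # strict interference
```

Counting ties, the enumeration gives roughly 2.5 per cent, and about 1.5 per cent for strict interference, in line with the small percentage quoted above.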

3.7. Acceptance Sampling

A wide variety of sampling experiments are regularly used to show 
the way various sampling inspection plans operate on "acceptable," 
“marginal,” and “rejectable” material, or on mixed product. It can be 
shown how the consumer is protected against bad quality, on a lot-by- 
lot basis and on an average-quality basis. An empirical operating char- 
acteristic curve, showing the observed probability of acceptance for 
given quality may be built up. Not only attribute plans, but also
variables plans may be illustrated in operations. For other possibilities, see
the original manual [5] and current manuals for short courses in quality
control, such as those at the Universities of Iowa, Illinois, and
Michigan.
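
An empirical operating characteristic curve of the kind described can be sketched as follows. The single-sampling plan used (sample 50, accept on 2 or fewer defectives) is a hypothetical choice, lots are treated as large enough for binomial sampling, and the function names are illustrative.

```python
import random
from math import comb

def prob_accept(p, n=50, c=2):
    """Theoretical probability of acceptance for fraction defective p."""
    return sum(comb(n, d) * p**d * (1 - p) ** (n - d) for d in range(c + 1))

def empirical_accept(p, n=50, c=2, lots=2000, seed=0):
    """Observed fraction of simulated lots accepted by the plan."""
    rng = random.Random(seed)
    accepted = 0
    for _ in range(lots):
        defectives = sum(1 for _ in range(n) if rng.random() < p)
        if defectives <= c:
            accepted += 1
    return accepted / lots

if __name__ == "__main__":
    for p in (0.01, 0.02, 0.04, 0.08, 0.12):
        print(f"p' = {p:.2f}  theory {prob_accept(p):.3f}  observed {empirical_accept(p):.3f}")
```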



3.8. Sequential Analysis

Experiments are especially effective in showing the way in which
sequential analysis reaches its decision. The same plan can be tried on
different populations, some “good” and others “bad.” Any of those
listed in Section 2 can be so used.


3.9. Linear Correlation


One can easily illustrate sampling from an uncorrelated population
by drawing independent random samples from the pair of populations.
For example, let Y be the total points for a throw of 3 dice, while X is
a drawing from population A. To illustrate a correlated population, one
can draw a chip from A, and then use this as a correction to whatever
one gets in a toss of 3 dice. Thus if a “minus 2” is drawn for X, and the
throw yields 13, we record 11 for Y, etc. For this case, we have r = .5018,
and a good approximation to a normal bi-variate population. It can
in fact readily be shown that

$$r = \sqrt{\sigma_x^{2} / (\sigma_x^{2} + \sigma_z^{2})},$$

where $\sigma_x^{2}$ is the variance of the X values, and $\sigma_z^{2}$ the variance of the Y
values before given the adjustment from X.

If weaker population correlations are desired, we may throw more
dice for the original Y value thus increasing $\sigma_z^{2}$. Stronger correlations
than .5 are obtainable through using distribution D for X and, say, 3
dice for the original Y, or else letting X be a drawing from distribution
D, and Y this drawing plus a drawing from A. The respective values
of r are .7608 and .8965.
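
The three quoted correlations can be recomputed from the variance formula r = sqrt(σx² / (σx² + σz²)). This is a sketch assuming the Population A and D frequencies as read from Table 2 (D taken as symmetric from −10 to +10) and a variance of 35n/12 for the total of n dice; names are illustrative.

```python
import math

def symmetric(half):
    """Build a symmetric population from frequencies for 0, 1, 2, ... ."""
    pop = {0: half[0]}
    for k, f in enumerate(half[1:], start=1):
        pop[k] = f
        pop[-k] = f
    return pop

POP_A = symmetric([48, 39, 23, 10, 3, 1])
POP_D = symmetric([23, 22, 20, 16, 12, 8, 5, 3, 1, 1, 1])

def variance(pop):
    """Population variance (divisor N)."""
    n = sum(pop.values())
    mean = sum(x * f for x, f in pop.items()) / n
    return sum(f * (x - mean) ** 2 for x, f in pop.items()) / n

def dice_variance(n_dice):
    """Variance of the total points on n_dice fair dice."""
    return 35 * n_dice / 12

def correlation(var_x, var_z):
    """r between X and Y = Z + X when X and Z are independent."""
    return math.sqrt(var_x / (var_x + var_z))

if __name__ == "__main__":
    print(round(correlation(variance(POP_A), dice_variance(3)), 4))  # X from A, 3 dice
    print(round(correlation(variance(POP_D), dice_variance(3)), 4))  # X from D, 3 dice
    print(round(correlation(variance(POP_D), variance(POP_A)), 4))   # X from D, plus A
```

The computed values agree with the quoted .5018, .7608 and .8965 to within a few units in the fourth decimal.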


3.10. Curve-fitting

In curve-fitting one can take data from an exact mathematical curve
at equally spaced X's, and find one Y value for each such X, by taking
the mathematical value and adding a random error. The latter should
be small relative to the variation in the mathematical curve if a close
relationship is desired. The formula in Section 3.9 can be used to give
the true correlation ratio for the non-linear population. Thus we have

$$\eta = \sqrt{\sigma_w^{2} / (\sigma_w^{2} + \sigma_z^{2})},$$

where $\sigma_w^{2}$ is the variance of the true curve ordinates and $\sigma_z^{2}$ is the
variance of the error component added to W.


3.11. Research Problems

Frequently a research worker is unable to find the distribution of
some statistic. Recourse may then be had to samples from a suitable
population. For example, an engineer in design wanted the frequency
distribution of a compounding of eccentricities at random angle. Thus
he wanted the distribution of Z in

$$Z^{2} = X^{2} + Y^{2} + 2XY\cos\theta,$$

where the distributions of X and Y were known and $\theta$ was rectangularly
distributed 0° to 180°. The approximate mean and variance were found,
but a sampling experiment was needed to determine the approximate
shape of the distribution.

A typical class experiment would be to find empirically the sampling
distribution of the standard deviation for a skewed or a J-shaped
population.
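
A sampling experiment of the kind the engineer needed might be sketched as follows. The distributions of X and Y are not given in the text, so rectangular distributions on (0, 1) are used here purely as placeholders; the function name and the number of trials are likewise assumptions.

```python
import math
import random

def simulate_z(draw_x, draw_y, trials=20000, seed=0):
    """Sample Z from Z^2 = X^2 + Y^2 + 2XY cos(theta), theta rectangular on 0 to 180 degrees."""
    rng = random.Random(seed)
    z_values = []
    for _ in range(trials):
        x, y = draw_x(rng), draw_y(rng)
        theta = rng.uniform(0.0, math.pi)
        z_squared = x * x + y * y + 2 * x * y * math.cos(theta)
        z_values.append(math.sqrt(max(0.0, z_squared)))  # guard against rounding below zero
    return z_values

if __name__ == "__main__":
    # Placeholder eccentricity distributions: rectangular on (0, 1).
    zs = simulate_z(lambda rng: rng.random(), lambda rng: rng.random())
    histogram = [0] * 10
    for z in zs:
        histogram[min(int(z / 0.2), 9)] += 1
    for i, count in enumerate(histogram):
        print(f"{0.2 * i:.1f}-{0.2 * (i + 1):.1f}  {count}")
```

Since cos θ averages zero over the half circle, the mean of Z² should be close to the sum of the mean squares of X and Y, which gives a quick check on the simulation before the shape of the histogram is studied.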


" 
4. CONCLUSION

The foregoing experiments could form the bulk of the theoretical
discussion of a course or could be used only as an occasional
supplement to derivations. If carefully explained and interpreted relative to
some practical situation, such experiments can be most helpful.

REFERENCES

[1] ASTM Manual on Quality Control of Materials, American Society for Testing
    Materials, Philadelphia, 1951. See p. 45.
[2] Burr, Irving W., Engineering Statistics and Quality Control, New York,
    McGraw-Hill (1953). See p. 209 for correction formula, pp. 274-80 for kinds
    of sampling.
[3] Burr, Irving W., “Some experiments illustrating principles of quality
    control,” Quality Control Report 12, Office of Production Research and
    Development, War Production Board, Washington, D. C., September, 1945.
[4] Olds, Edwin G., and Knowler, Lloyd A., “Teaching statistical quality control
    for town and gown,” Journal of the American Statistical Association, 44
    (1949), 213-30.
[5] Working, Holbrook, and Olds, Edwin G., An Introduction to Statistical
    Methods of Quality Control in Industry: An Outline of a Course of Lectures and
    Exercises, Office of Production Research and Development, War Production
    Board, Washington, D. C., April 1944. Presently available at U. S.
    Government Printing Office.




ERRATA 


Readers and authors are invited to submit corrections to papers
published in any previous issue. These will be published each year, in
the December issue.


Grab, Edwin L., and Savage, I. Richard, TABLES OF THE EXPECTED
VALUE OF 1/X FOR POSITIVE BERNOULLI AND POISSON VARIABLES,
Vol. 49, No. 265 (March 1954), 169-77.

The following paper has recently been drawn to our attention:
J. Tiago de Oliveira, “Sur le calcul des moments de la réciproque d'une
variable aléatoire positive de Bernoulli et Poisson,” Anais da Faculdade
de Ciências do Porto, 36 (1952), 5-8.

The material in de Oliveira's paper suggests alternative methods for
computing the tables of our paper.


Laderman, J., Littauer, S. B., and Tukey, John W., THE INVENTORY
PROBLEM, Vol. 48, No. 261 (December 1953), 717-732.
The following corrections should be made:

Page     Line                      Reads       Should Read
728      12 (2nd display line)     fordXy.     fordzy.
726      16                        henec       hence
[...]    [...]                     2535        225


Roshwalb, Irving, EFFECT OF WEIGHTING BY CARD-DUPLICATION ON
EFFICIENCY OF SURVEY RESULTS, Vol. 48, No. 261 (December 1953),
773-777.

The first term within the brackets of expressions (3) and (6) should
read 4n(P−N)/(P−1) instead of 4n(P−n)/(P−1). As a consequence,
the coefficient of r in the denominator of (4) should be $(1+d)^{2}$ instead
of $(5d^{2}-2d+1)$. These changes do not affect the tabulated relative
information in the case of sampling with replacement and have no
significant effect except in the case of high sampling rates.


Savage, L. J., THE THEORY OF STATISTICAL DECISION, Vol. 46, No. 253
(March 1951), 55-67.

On page 60, in lines 19, 32, and 36 change $1 to $1/2; and in line 36
change $2 to $4.



STATISTICAL ABSTRACTS


All communications concerning this section should be addressed to
the Abstracts Editor, Professor George E. Nicholson, Jr., Chairman of
the Department of Statistics, University of North Carolina, Chapel
Hill, North Carolina.


Aroian, Leo A., “What Makes a Quality Control Chart Tick,”
Industrial Quality Control, 10 (1954), 38-43.

The concept of errors of type I and type II are presented for control
charts based on fraction defective and number of defectives. Tables and
charts are included showing the relation between the two types of
errors as functions of selected sample sizes and quality levels. Two
probability models are considered when the process is out of control:
(1) it is operating at a new constant level, (2) it is operating at a
different level at each decision point. GERALD J. LIEBERMAN,
Stanford University.


Borch, Karl, “Effects on Demand of Changes in the Distribution of
Income,” Econometrica, 21 (1953), 325-31.

This paper reports the results of a particular attempt to lend a more
explicit economic interpretation to effects frequently attributed to
“time” variables in econometric analyses. The particular empirical
results on which this further analysis is based are those of Prest (The
Review of Economics and Statistics, 31: 1, 1949), as revised by Farrell
(Econometrica, 20: 2, 1952), for beer, spirits, and tobacco. Prest
analyzed demand for these three goods in the United Kingdom, using
time series data for the period 1870-1938, omitting 1915-1919. The
basic form of the demand relation used by Prest was,
$C = kY^{a}P^{b}e^{ct+dt^{2}+fz}$, where C and Y are, respectively, consumption
and income per capita, P is price deflated by a cost-of-living index, t is
time in years, and z is a discontinuity variate taking value 0 for
1870-1914 and 1 for 1920-1938. In Prest's results the time variables in
the exponent were dominant variables in “explaining” the variation in
consumption over the period. The present author's hypothesis [...]
changing income distribution [...] economic factor underlying the
highly significant coefficients of the [...] Prest's results.

The author presents results of an an[...] [...]lar formulation
consistent with this hypothesis. The form of the income distribution
function is assumed to be the logarithmic-normal and income elasticity
of consumption is assumed to be represented by E(y) = p + q/log y. In
this framework the pattern of change in income distribution over time
which would approximately account for the effect of the trend found by
Prest is determined. Taking Farrell's revised estimates of the
coefficients a, c, d, and f as given, the coefficients p and q in the income
elasticity expression are approximated. Having values for p and q, the
coefficient of variation of the income distributions at selected points in
time are calculated. The results, apart from extreme values for spirits
in two years, reflect a marked trend toward equality in the distribution
of income. On a priori grounds this is considered an acceptable
development of the income distribution. The p and q obtained are also
used to calculate income elasticities corresponding to different levels
of income. It is concluded that the resulting elasticities do not
generally conform to what would be expected intuitively. A main
conclusion of the author is the following: “One should not draw overly
general conclusions from the rough calculations in this paper. The
results seem, however, to indicate that changes in the distribution of
income can play an important part in explaining time trends in demand
functions.” IVAN M. LEE, University of California.


Brown, T. M., “Standard Error of Forecast of a Complete Econometric
Model,” Econometrica, 22 (1954), 178-92.

The author develops and presents in matrix form approximate
formulas for the estimation of the elements of the vector of the
standard error of forecast of the endogenous variables in a
multiequation [...] framework is de[...] [...]tions of the single
equation case, a general expression for the forecast variance is
developed as a sum of two components, viz., the variance of the
estimated mean of Y for given Z


and the disturbance variance. This may be expressed in matrix form as,
$\sigma^{2}(Y_F) = Z_F[\sigma(a)]Z_F' + \sigma^{2}$, where $Z_F$ is the vector of
predetermined variables (assumed known) in the forecast period,
$\sigma(a)$ is the covariance matrix of estimated coefficients, and
$\sigma^{2}$ is the disturbance variance. In the single equation case,
$\sigma(a)$ and $\sigma^{2}$ are estimated by well-known methods.

In the multiequation complete econometric model, the above forecast
variance becomes a vector containing as many elements as there are
endogenous variables in the system. Each element of this vector may be
viewed as the sum of two components analogous to those specified for
the single equation case. The multiequation complete econometric
model may be expressed, $\beta Y' + \Gamma Z' = AX' = \mu_p'$, where Y, Z and
$\mu_p$ are, respectively, vectors of endogenous variables, predetermined
variables, and unobservable disturbance terms; $\beta$ and $\Gamma$ are
population coefficient matrices; $X = [YZ]$; and $A = [\beta\Gamma]$. Let the
estimated model be represented by $BY' + CZ' = \hat{A}X' = u_p'$, where B, C,
and $u_p$ are estimates of $\beta$, $\Gamma$, and $\mu_p$, respectively. Written
in a form most suitable for forecasting Y for given Z, the estimated
model becomes, $Y' = -B^{-1}CZ' + B^{-1}u_p' = FZ' + u_t'$; $u_p$ have been
designated “partial residuals” and $u_t$ “total residuals.” The system
expressed in the above form is referred to as the “forecast reduced
form.” Let $a^{*}$ be a vector containing all nonzero, nonunit elements
of estimated matrix $\hat{A}$. Then, since $F = -B^{-1}C$, each element
$f_{ij}$ of F is a function of the elements of $a^{*}$. A forecast of the
ith endogenous variable ($Y_i$) can be calculated from the ith equation
of the above “forecast reduced form” system. By analogy with the single
equation development, the [...] forecast variance is the sum of two
components and may be written, $S^{2}(Y_{iF}) = Z_F[S(f_i)]Z_F' + S_{ti}^{2}$,
where $Z_F$ is the vector of “known” values of the predetermined
variables in the forecast period, $S(f_i)$ is the estimated covariance
matrix of the estimated coefficients $f_i$, and $S_{ti}^{2}$ is the
estimated variance of “total residuals” $u_{ti}$ in the ith equation. In
the procedure outlined, $S_{ti}^{2}$ [...] from the expression,
$S_{ti}^{2} = (1, -f_i)[M_{x_i x_i}](1, -f_i)'$, where $(1, -f_i)$ is the
coefficient vector and $M_{x_i x_i}$ is the moment matrix of variables
[...] i. The [...] $S(f_i)$ are obtained from,
$S(f_i) = [\partial f_i/\partial a^{*}]\,S(a^{*})\,[\partial f_i/\partial a^{*}]'$,
where $\partial f_i/\partial a^{*}$ is the [...] of $f_i$ with respect to
$a^{*}$, and $S(a^{*})$ is the covariance matrix of the estimated
structural coefficients $a^{*}$. The elements of $S(a^{*})$ are estimated
from the negative inverse of the matrix of second order partial
derivatives of the logarithmic likelihood function. The elements of both
$[\partial f_i/\partial a^{*}]$ and $S(a^{*})$ are evaluated at the point
of [...] estimates $a^{*}$.

In conclusion, a few remarks are offered with respect to degrees of
freedom in small samples, confidence versus tolerance intervals, and
reduced form equations (coefficients of the reduced form estimated
directly by least squares) versus the “forecast reduced form” as a
basis for forecasting. The suggested rules in the case of small samples
are taken directly by analogy from the single equation case in which
Markoff conditions are met and are recognized as possibly quite
inappropriate to the multiequation model. A minor error is noted in
the correction factor in equation 31, which, by analogy with least
squares, should [...] (T/T − m)[...]. IVAN M. LEE, University of
California.


Downton, F., “Least-squares Estimates Using Ordered Observations,”
Annals of Mathematical Statistics, 25 (1954), 303-16.

Ordered least-squares estimates are obtained for a class of
2-parameter distributions of the form $f((x-\mu)/\sigma)/\sigma$. The
general expressions for the quantities to evaluate the estimates of
$\mu$ and $\sigma$ are given for this case. In any special case,
substitutions can be made in the general formulas to give the
particular estimates. The ordered least-squares estimates of 2
distributions, viz. the rectangular and the right triangular, are given
as special cases. The expected values, the variance matrix, the
estimates and their variances are calculated for the latter
distribution for samples up to size 10. Furthermore, the single
parameter system of the form $f(x/\lambda)/\lambda$ is considered and
from this is derived the ordered least squares estimate of $\lambda$,
and the expressions for the quantities needed to calculate the
estimate. In the case of the exponential distribution
$f(x) = e^{-x/\lambda}/\lambda$, the estimate of $\lambda$ is the sample
mean. A Pearson Type III distribution, depending upon a single
dispersion parameter, is discussed and the ordered least-squares
estimate turns out to be identical with the maximum likelihood
estimate. A. E. SARHAN, University of North Carolina.


Durbin, J., “Some Results in Sampling Theory when the Units Are
Selected with Unequal Probabilities,” Journal of the Royal Statistical
Society, Series B, 15 (1953), 262-69.

A rule is given for calculating estimates of sampling error that can
be applied to multistage designs of any degree of complexity and which
is an extension of the rule given by Yates for the case of equal
probabilities of selection. The relation between the theories of
sampling with and without replacement is discussed and two approximate
procedures are described which are easy to apply but lead to slight
overestimates of the sampling error. T. S. RUSSELL, Virginia
Polytechnic Institute.



Epstein, B., and Sobel, M., “Some Methods Relevant to Life Testing from an
Exponential Distribution,” Annals of Mathematical Statistics, 25 (1954),
373-81.

The authors consider N items for life testing divided into k sets $S_j$, each
containing $n_j$ items, with lives following the two-parameter exponential
distribution $(1/\theta)e^{-(x-A)/\theta}$, $x \ge A \ge 0$. Each set is
observed until the first $r_j$ failures occur. Three different cases are
considered according as the $n_j$ items have a common known or unknown $A_j$,
or the N items have a common unknown A. Some preliminary lemmas and
corollaries are given concerning the r ordered observations out of n based on
the given exponential distribution. The maximum likelihood estimate of
$\theta$ in the three cases is obtained. The article shows that the random
variable $2R\hat{\theta}/\theta$ ($R = \sum r_j$) is distributed as
$\chi^2(2R)$, $\chi^2(2R-2k)$ (k sets), and $\chi^2(2R-2)$ in the three cases
respectively. For the case in which the N items have a common unknown A, the
confidence limits for $\theta$ not involving A, and for A not involving
$\theta$, have been worked out. They show that the three cases are equivalent
to assuming that the life distribution in the various sets will plot in the
form of straight line(s) whose slope(s) can be estimated by their results.
A. E. SARHAN, University of North Carolina.
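
As an illustration of the first case (a common known A), the following Python sketch pools the k sets and estimates θ as the total observed life beyond A divided by the total number of failures R. The abstract does not print the estimator, so the total-life form used here, the function name `theta_hat`, and the data layout are assumptions made for illustration rather than a transcription of the authors' formulas.

```python
def theta_hat(sets, A=0.0):
    """Estimate theta from censored life-test sets (sketch, known A).

    sets : list of (n_j, failures_j), where failures_j holds the first
           r_j ordered failure times observed among the n_j items.
    A    : known location parameter of the exponential distribution.
    """
    total_life = 0.0   # accumulated life beyond A over all items on test
    R = 0              # total number of observed failures, R = sum of r_j
    for n, failures in sets:
        times = sorted(failures)
        r = len(times)
        if r == 0 or r > n:
            raise ValueError("each set needs 1 <= r_j <= n_j failures")
        # failed items contribute their own lifetimes
        total_life += sum(t - A for t in times)
        # the n - r unfailed items are each credited with the r-th failure time
        total_life += (n - r) * (times[-1] - A)
        R += r
    return total_life / R


if __name__ == "__main__":
    # hypothetical data: two sets of 5 and 4 items, stopped after 3 and 2 failures
    data = [(5, [1.0, 2.0, 3.0]), (4, [2.0, 4.0])]
    print(theta_hat(data))
```

Under the result quoted in the abstract for this case, the quantity 2R times this estimate divided by θ would be referred to chi-square with 2R degrees of freedom to obtain confidence limits for θ.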


Gartaganis, Arthur J., “Autoregression in the United States Economy,
1870-1929,” Econometrica, 22 (1954), 228-43.

Correlograms for 1 to 6 lags were obtained for each of 83 economic series
taken from Burns (Production Trends in the United States Since 1870). The
series were classified in agriculture, mining, and manufacturing sectors, also
similar to those defined by Burns. The time series covering the period
1870-1929 were broken into two periods (1870-1913 and 1914-1929) for analysis
and comparison of autoregressive structure. Mean correlograms were constructed
for each sector and for each period. Results of approximate tests, based on
“Student's” t, for differences in mean autocorrelations between time periods
within sectors and between sectors within time periods for each lag are
reported. Significant differences, “with minor exceptions,” are reported
between time periods within segments. A few significant differences appear
between segments within time periods at several lags, but here the differences
are not so marked. Autocorrelation coefficients for a few series with and
without trend adjustment (deflation by population series) are tabled for
comparison.

Although autoregressive structures are not calculated, it is asserted that
graphic appraisal of the series during the 1870-1913 period indicates that
they are evolutive. For the 1914-1929 period, autoregressive structures are
determined for the agriculture and mining sectors from the mean
autocorrelations calculated. The roots of the characteristic equations of
these structures suggest that the autoregressive systems are stationary.
Because of “heterogeneity” the manufacturing sector is disregarded in this
latter analysis. From the mean autocorrelations and the autoregressive
coefficients, approximations to mean autocorrelations for additional lags are
obtained for agriculture and mining. The correlograms dampen and finally
vanish. This suggests either a moving average or autoregressive type of
structure. The author ventures the opinion that the structures are
autoregressive. The author's main conclusions are summarized as follows: “(1)
The autoregressive structure of the economy for the period 1870-1913 differs
from that of the 1914-1929 period. (2) Orcutt's hypothesis (Journal of the
Royal Statistical Society, Series B, 10: 1, 1948) that Tinbergen's series of
the 1919-1931 period can be considered as a sample drawn from a population
having the autoregressive structure,
$\Delta y_t = .3\,\Delta y_{t-1} + \epsilon_t$, is not empirically
substantiated by our sample.” The alternative hypothesis is stated that for
the similar period, 1914-1929, the American economy can be considered as a
population having different underlying autoregressive structures. Two of these
have been estimated in this paper. They are for agriculture and mining,
respectively:
$x_t = .2074x_{t-1} - .0240x_{t-2} - .0561x_{t-3} + .0192x_{t-4} + .0403x_{t-5} - .1093x_{t-6} + \epsilon_t$,
and
$x_t = .3054x_{t-1} + .2582x_{t-2} - .1384x_{t-3} - .2509x_{t-4} - .0798x_{t-5} + .1831x_{t-6} + \epsilon_t$.
IVAN M. LEE, University of California.


Gulliksen, H., “A least squares solution for successive intervals assuming
unequal standard deviations,” Psychometrika, 19 (1954), 117-39.

A least squares solution for the scale values obtained by using the method of
successive intervals for the basic observational data is derived. The
theoretical solution depends upon solving simultaneously for the scale values
($m_i$), the discriminal dispersions ($s_i$), and the category boundaries
($t_g$) which will minimize the quantity

$(1/b^2)\sum\sum (s_i z_{ig} + m_i - t_g)^2$,

where $z_{ig}$ is a normal deviate corresponding to an observed proportion and
b is an arbitrarily assigned standard deviation for $t_g$.

Numerically the direct least squares solution is laborious; methods for
simplifying the computations are presented. A series of numerical examples
compare the relative accuracy of scales obtained from various computational
procedures. B. J. WINER, University of North Carolina.


Gurland, John, “An Example of Autocorrelated Disturbances in Linear
Regression,” Econometrica, 22 (1954), 218-27.

The author investigates the loss of efficiency of estimators of the regression
parameters when there are certain types of specification bias concerning the
disturbances. Let $y_i = \eta_i + u_i$ $(i = 1, 2, \cdots, n)$, where
$\eta_i$, the expected value of $y_i$, is a linear combination,
$\eta_i = \theta_1 x_{1i} + \theta_2 x_{2i} + \cdots + \theta_{k-1} x_{k-1,i} + \theta_k$,
of the unknown parameters $\theta$, and $Eu_i = 0$. The covariance matrix of
disturbances $u_i$ ($\Omega$) consists of elements
$Eu_iu_j = \sigma^2\omega_{ij}$. It is assumed that the x's are “fixed”
variates and that the elements $\omega_{ij}$ are known. The disturbances
are assumed to be generated by a first-order
Markoff process, Ui—puri1=%, where 
Su SUA S EE "ydp dps se 
n), Hu=0, E» -—0 (tt), Ev, 2=0?, 
@=—М, аф, рр, йот 
n). DE S value" u_y i$ ed by 
U_n = 60_y, where the value, of ô is 
arbitrarily, Define on?= eee 
її, If the true values of P and 6 are 
known, the best linear unbiased estimates 
of the parameters 8 may be obtained by 
means of the transformation =y: 
= рл, UE it = TM — p3h t (£2, 
Hep Oh Weed 2 «E with a 
MON i — bow, and zi = 2)/gy and solv. 
ing the k т equations 9/8, [1/gy? (y, 
=0. If incorrect values 
or both, the estimates, 
will no longer be “best” 


=EN HZ (81—n,)2] 
are used for p or à, 
although unbiased, 
in general, 




close to 1. Assuming that the true value of $\rho$ is known, the author
investigates the loss of efficiency from unjustifiably omitting the term
$(1/g_N^2)(y_1 - \eta_1)^2$, that is, assuming $g_N$ extremely large when in
fact it is not. For the case k=2, the author derives the limiting expression
for the joint efficiency of the estimated regression parameters under the
incorrectly specified $g_N$ (denoted by g*). It is then shown that “there
exist values of n and g* in one case and values of $\rho$ and g* in another
for which the efficiency is arbitrarily close to zero.” Limiting expressions
derived hold also in the case of evolutionary series for $u_i$. Three
interpretations of the assumption that g* is very large are given.

Also investigated is the possible loss of efficiency in assuming $u_i$ to be a
stationary process when, in fact, it has an initial fixed value $u_{-N} = 0$.
From the expression derived for joint efficiency, it is concluded that this
incorrect specification could be a source of the considerable loss of
efficiency of the estimates obtained by Cochrane and Orcutt from their series
designated by (B).

In an appendix, the joint efficiency of estimated regression parameters is
derived for the general case of incorrectly specified disturbance covariance
matrix. A minor notational omission appears on page 226 where, in the
covariance and joint efficiency expressions, $\sigma$ should read
$\sigma^2$. IVAN M. LEE, University of California.


Guttman, L., “Some Necessary Conditions for Common-Factor Analysis,”
Psychometrika, 19 (1954), 149-61.

One of the fundamental problems in common-factor analysis is: given a matrix
of sample intercorrelations R having units in the diagonals, to find a
diagonal matrix $U_i^2$ such that $G_i = R - U_i^2$ is a Grammian matrix of
minimum rank. Three theorems which give lower bounds on the rank of G are
developed.

Let $s_1$ be the number of latent roots of $G_1$ which are greater than or
equal to unity. For any matrix $U_i^2$ leaving $G_i$ Grammian, the minimum
rank of $G_i$ is shown to be greater than or equal to $s_1$. If $U_2^2$ has as
its jth element the multiple correlation of variable j with all other
variables, then the resulting $G_2$ will have minimum rank equal to or greater
than $s_2$. If $U_3^2$ has as its jth element the highest zero order
correlation with the other variables, then the resulting $G_3$ will have
minimum rank equal to or greater than $s_3$. The three lower bounds for G can
be ordered as follows: $r \ge s_2 \ge s_3 \ge s_1$. In practice $s_1$ will
generally be the simplest lower bound to compute. B. J. WINER, University of
North Carolina.


Hotelling, Harold, “New Light on the Correlation Coefficient and its
Transforms,” Journal of the Royal Statistical Society, Series B, 15 (1953),
193-232.

This paper presents parts of the derivation of the distribution of r in a new
form. A method of obtaining the probabilities associated with the elementary
case of the population correlation $\rho$ being zero, for large or for small
samples is given without the use of tables other than of logarithms. For
$\rho \ne 0$ the slowly convergent series of incomplete beta functions for the
probability integral given by Pearson is replaced by a rapidly convergent
series of such functions.

The moments of r about $\rho$ and the moments of z are calculated by a new
method. This paper also examines the possibility of improvement over the use
of z. Certain points in the mathematical theory of correlation coefficients
are simplified to make more feasible their inclusion in future […] CLYDE Y.
KRAMER, Virginia Polytechnic Institute.


Hughes, Bryon A., “[…] and Interpreting Physical Measurements of Groups of
Children,” American Journal of Public Health, 44 (1954), 766-74.

The article points out how three statistical techniques can be valuable tools
in analytical surveys of physical measurements of groups of school children.
First, analysis of covariance should be used to adjust for variables (e.g.
age, height) which can be measured but not often controlled in sample
selection. The second tool discussed involves consideration of statistical
[…] variate mathematical model to […] studying the joint effect […] factors.
In the appendix, the author mentions some nonsampling errors that frequently
are not taken into consideration by workers in anthropometric nutrition
research. BERNARD G. GREENBERG, University of North Carolina.


King, E. P., “Probability Limits for the Average Chart When Process Standards
Are Unspecified,” Industrial Quality Control, 10 (1954), 62-64.

New control limits for X are presented 


o/Vn 

where m is the number of subgroups, n is the subgroup size and
i = 1, 2, ···, m. The control limits proposed are X ± CR


where 
C ks / (202). 

These limits provide “short run” X charts with approximately a constant type
I error. A graph of C factors is included for subgroup sizes of 2, 3, 4, 5,
and 10 and for sample numbers of 3 through 25. The C factors presented are
approximate. GERALD J. LIEBERMAN, Stanford University.


Klein, L. R., and Mooney, H. W., “Negro-White Savings Differentials and the
Consumption Function Problem,” Econometrica, 21 (1953), 425-56.

Data from the 1947, 1948, 1949, and 1950 Surveys of Consumer Finances form the
[…] classes while the […] versed in the higher income […] Survey data for the
North are […] relations suggested by the Consumer Purchases Study of 1935-36.
The analyses summarized here bear on propositions advanced by the present
authors and others previ[…] “explaining” […] of nonfarm, nonbusiness […]
served as the basis for an[…] which the characteristic, […] homeownership, was
represented as a separate variable in the analysis. Measures used directly or
in constructing explanatory variables were: disposable income, liquid asset
holdings, number of persons in spending unit, age of spending unit head, and
lagged disposable income. The residuals from regression in each equation were
averaged for Negroes, whites and others. The mean of residuals for Negroes in
each equation was positive and larger than for whites. The results were
presented as suggestive, recognizing that the racial differences in mean
residuals are not statistically significant.

Another main section of the paper reports results of variance analyses of mean
savings-ratio deviations. For each regional-racial group, the deviation of
each spending unit's savings-income ratio from the mean ratio of its income
class was calculated. In selected analyses, year served also as a separate
variable, while in others data for the four years were combined. Additional
variables are then introduced on the basis of which the savings-ratio
deviations are further classified. The mean of the deviations falling in each
cell of the resulting multiple cross classifications is the random variable
analyzed in a factorial design with one observation per cell.
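
The savings-ratio deviation described in this paragraph can be sketched in a few lines of Python. The record layout, the function name `savings_ratio_deviations`, and the reading of “mean ratio of its income class” as the unweighted mean of the unit ratios are assumptions made for illustration; the figures are hypothetical and are not taken from the survey.

```python
from collections import defaultdict


def savings_ratio_deviations(units):
    """units: list of (income_class, saving, income) for one regional-racial group.

    Returns each unit's savings-income ratio minus the mean ratio of its
    income class, in the same order as the input.
    """
    ratios = [(cls, saving / income) for cls, saving, income in units]
    by_class = defaultdict(list)
    for cls, ratio in ratios:
        by_class[cls].append(ratio)
    # unweighted mean of unit ratios within each income class (assumed reading)
    class_mean = {cls: sum(v) / len(v) for cls, v in by_class.items()}
    return [ratio - class_mean[cls] for cls, ratio in ratios]


if __name__ == "__main__":
    # hypothetical spending units: (income class, saving, disposable income)
    group = [("low", 100.0, 1000.0), ("low", 300.0, 1000.0), ("high", 1000.0, 5000.0)]
    print(savings_ratio_deviations(group))
```

The deviations within any one income class sum to zero by construction, so the cell means analyzed in the factorial design measure departures from the income-class norm rather than the level of saving itself.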

Among the additional variables on which classifications for the several
variance analyses are based are: (1) liquid asset holdings, (2) past income
change, and (3) job security. Significant main effects of the first two of the
above variables are reported as well as significant interactions of one or
more of these variables with race, region, and/or disposable income. For
several of the tests, data giving rise to significant interactions are
reproduced to facilitate interpretation. Finally, reference is made to
variance analysis results with certain other variables, although the results
are not reported in detail.

A supplemental device employed in the paper is the presentation of percentage
distributions of spending units with respect to the several variables
introduced by regional-racial-income classes. A brief section presenting and
discussing the implications of a percentage distribution with respect to
credit use appears in a final section of the text of the paper. IVAN M. LEE,
University of California.


[…] O., “On the Transition Probabilities Corresponding to Accident
Distribution,” Journal of the Royal Statistical Society, Series B, 15 (1953),
87-89.

[…] accidents in a fixed exposure time T, the expected number of other
accidents sustained by a person who has had x accidents is shown to be the
ratio of the (x+1)th to the xth factorial moment of the distribution. The
limiting value of this ratio when the exposure time tends to zero gives the
transition probabilities. From the form of the frequency distribution the
transition probabilities are derived. G. L. EDGETT, Virginia Polytechnic
Institute.


Lukacs, Eugene, “On Strongly Continuous Stochastic Processes,” Sankhyā, 13,
Part 3 (1954), 219-28.

The first theorem is concerned with the normality of increments of a strongly
continuous stochastic process. The proof makes use of the ε, δ definition of
strong continuity. Various properties of strongly continuous processes are
then derived and used in the proofs of theorems 2 and 3. Theorem 2 states
necessary and sufficient conditions for a stochastic process to be a Wiener
process and the sufficiency of the conditions is demonstrated. Theorem 3, the
last theorem of the paper, shows that the variance of a strongly continuous
process with independent increments need not
be independent of the time t. F. S. Mo- 
вету, Virginia Polytechnic Institute. 


Masuyama, Motosaburo, “Mathematical Note on Area Sampling,” Sankhyā, 13, Part
3 (1954), 241-42.

The author gathers together some results of integral geometry due to Poincaré,
Crofton, Blaschke, Santaló and his own work in order to draw attention to the
possibilities of application to statistical problems of area sampling. R. J.
TAYLOR, Virginia Polytechnic Institute.


Matthai, Abraham, “On Selecting Random Numbers for Large Scale Sampling,”
Sankhyā, 13 (1954), 257-60.

Random numbers in small scale work may be selected without much regard to cost
considerations; but large scale work requires a method which reduces the labor
of selection.

In an example cited it is necessary to choose a random sample of 800 out of
70,000. The selection rule is to assign ten digits in a random number table to
each five digits to be selected, and to take the last four digits preceded by
the first digit to the left less than 7. To illustrate:

» 797 6345 0912 gives 50912 

25 © 2987 4391 gives 24391 
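
The selection rule can be sketched in Python as follows. The sketch assumes that “the first digit to the left less than 7” means the first admissible digit met when scanning leftward from the last four digits; that reading reproduces both illustrations, but it is an interpretation of the abstract, and the function name `select_unit` is hypothetical. Only the digits legible in the second illustration are used.

```python
def select_unit(block, limit=7):
    """Return a five-digit selection from a block of random digits, or None.

    The last four digits are kept; the leading digit is the first digit
    smaller than `limit` met when scanning leftward from them.
    """
    digits = [ch for ch in block if ch.isdigit()]   # ignore spaces and stray marks
    if len(digits) < 5:
        raise ValueError("need at least five digits")
    last_four = "".join(digits[-4:])
    for d in reversed(digits[:-4]):                 # scan right to left
        if int(d) < limit:
            return d + last_four
    return None                                     # no admissible leading digit: reject


if __name__ == "__main__":
    print(select_unit("797 6345 0912"))   # first illustration in the text
    print(select_unit("2987 4391"))       # legible digits of the second illustration
```

Because a block is rejected only when none of the leading digits is below 7, very few blocks of random digits are wasted, which is the saving in labor the abstract refers to.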


In a second case, one must select a set of random numbers less than 2853. A
similar method of selection has an expected rejection rate of 4.9%, as
compared with 71.5% in the method of rejecting all four digit numbers greater
than 2853 and with 14.4% in the method of dividing by 3000 and taking
remainders.

The appropriateness of the method is established by $\chi^2$ tests. A. N.
Pozner, Virginia Polytechnic Institute.
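
The two comparison figures quoted for the second case can be checked by direct enumeration. The sketch below assumes that the units are numbered 1 to 2853, that four-digit random numbers run from 0000 to 9999, and that the remainder method discards 9000-9999 before dividing by 3000; these details are assumptions, since the abstract names the methods without spelling them out.

```python
LIMIT = 2853


def rejection_rate(accept):
    """Share of the 10,000 four-digit numbers 0000-9999 that `accept` turns away."""
    rejected = sum(1 for x in range(10000) if not accept(x))
    return rejected / 10000


def direct(x):
    # use a four-digit number only if it falls in 1..2853
    return 1 <= x <= LIMIT


def remainder(x):
    # discard 9000-9999, divide by 3000, keep remainders in 1..2853
    return x < 9000 and 1 <= x % 3000 <= LIMIT


if __name__ == "__main__":
    print(round(100 * rejection_rate(direct), 1))      # compare with 71.5% in the text
    print(round(100 * rejection_rate(remainder), 1))   # compare with 14.4% in the text
```

Under these assumptions the enumeration gives 71.5 and 14.4 per cent, in agreement with the abstract.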


McGill, W. J., “Multivariate Information Transmission,” Psychometrika, 19
(1954), 97-116.

A model for handling multidimensional contingency tables in terms of
information theory is developed. The method of analysis used is analogous in
some respects to the analysis of chi-square into its components. The sampling
distributions of some of the statistics that are computed in the […]
particularly those […] not been tabulated. […] such tables become available,
the method developed should provide the research worker with a useful analytic
tool.

The information transmitted from two inputs, u and v, to an output, y, is
defined to be

$T(u, v; y) = T(u; y) + T(v; y) + A(uvy)$,

where $T(u; y)$ and $T(v; y)$ represent the bivariate transmission in bit[…]
$A(uvy)$ represents the interaction effect. One measure of the interaction
effect is shown to be

$A(uvy) = T_v(u; y) - T(u; y)$,

where $T_v(u; y)$ is the average information transmitted between u and y for
constant value v. Extension of this model to the general multivariate case
involving several orders of interaction is direct.

A numerical example analyzing output […] equivalent of main ef[…] is worked
out […] this function are said to be lacking. Approximate distributions […]
involved in multivariate transmission have been developed. Certain of these
distributions useful in testing main effects are given. B. J. WINER,
University of North Carolina.


Moran, P. A. P., “The Estimation of the Parameters of a Birth and Death
Process,” Journal of the Royal Statistical Society, Series B, 15 (1953),
241-46.

The estimation of […] is considered for a simple birth and death process. It
is equivalent to the problem of estimating the probabilities of steps to the
right and left from an observed realization of a random walk which has one
absorbing boundary and which is terminated, if necessary, after a preassigned
number of steps. The properties of various estimators are considered. T. S.
RUSSELL, Virginia Polytechnic Institute.


Olkin, I., and Roy, S. N., “On Multivariate Distribution Theory,” Annals of
Mathematical Statistics, 25 (1954), 329-39.

The authors develop a matrix method of handling a large class of multivariate
distribution problems including in particular those for which the Wishart
distribution is not available (e.g., the case of a sample of N observations
from a p-variate normal population with p > N − 1). Two techniques are used
for evaluating the Jacobians of certain transformations. The first is applied
to obtain the joint distribution of the rectangular coordinates. The second is
applied to obtain the joint distribution of the roots of a determinantal
equation. A. E. SARHAN, University of North Carolina.


Podder, K. C., “On the Punched Card Method of Smoothing for Age Bias in Census
Returns,” Sankhyā, 13, Part 3 (1954), 261-66.

A method of […] the preparation of the […] of India using Hollerith computing
equipment is described. A table showing the method used and a specimen working
table are given. The actual machine operations are described by use of a cycle
chart and Control Panel wiring diagrams. R. J. TAYLOR, Virginia Polytechnic
Institute.
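
The decomposition of multivariate transmission quoted in the McGill abstract above can be verified numerically. The Python sketch below computes the three transmissions and the interaction term from a joint probability table, using the entropy form T(u; y) = H(u) + H(y) − H(u, y); the function names, the dictionary layout, and the example table (an output that is 1 exactly when the two binary inputs differ) are hypothetical choices made for illustration.

```python
from collections import defaultdict
from math import log2


def entropy(joint, keep):
    """Entropy in bits of the marginal over the coordinate positions in `keep`."""
    marg = defaultdict(float)
    for cell, p in joint.items():
        marg[tuple(cell[i] for i in keep)] += p
    return -sum(p * log2(p) for p in marg.values() if p > 0)


def transmission(joint, inputs, output):
    """T(inputs; output) = H(inputs) + H(output) - H(inputs, output)."""
    return (entropy(joint, inputs) + entropy(joint, output)
            - entropy(joint, inputs + output))


def interaction(joint):
    """A(uvy) = T_v(u; y) - T(u; y), with cells indexed as (u, v, y)."""
    U, V, Y = (0,), (1,), (2,)
    # T_v(u; y): transmission between u and y averaged over the values of v
    t_v = (entropy(joint, U + V) + entropy(joint, V + Y)
           - entropy(joint, V) - entropy(joint, U + V + Y))
    return t_v - transmission(joint, U, Y)


if __name__ == "__main__":
    # hypothetical table: y = 1 exactly when the two binary inputs differ
    joint = {(u, v, u ^ v): 0.25 for u in (0, 1) for v in (0, 1)}
    U, V, Y = (0,), (1,), (2,)
    total = transmission(joint, U + V, Y)
    parts = transmission(joint, U, Y) + transmission(joint, V, Y) + interaction(joint)
    print(total, parts)
```

In this example neither input alone transmits any information about the output, so the whole of the joint transmission appears in the interaction term, which is the kind of effect the bivariate measures by themselves would miss.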


Psychological Research Wing, “Multiple Factor Analysis of Personality Ratings
in Services Selection Boards,” Sankhyā, 13 (1953), 17-26.

The purpose of the investigation reported in this paper was the study of the
functional unities underlying the checking of qualities on a rating scale used
by Indian Army Selection Boards for the selection of officer candidates, and
as such is a good example of a complete factor analysis. A sample of 418 boys,
each studied in relation to 21 qualities, was taken, and the resulting data
was subjected to a complete centroid factor analysis, utilizing Tucker's
criterion for the stopping rule. Three factors were extracted which were felt
could sufficiently explain the inter-correlations. The resulting matrix was
rotated by the method of extended vectors, and the three primary factors were
obtained and identified as: (1) Intellectual Factor, (2) Social Factor, and
(3) Dynamic Factor. H. C. SWEENY, Virginia Polytechnic Institute.


Sengupta, J. M., “Some Experiments with Different Types of Area Sampling for
Winter Paddy in Giridih, Bihar: 1945,” Sankhyā, 13, Part 3 (1954), 235-41.

The object was to study the relative efficiencies of different sampling units,
with variations in the method of enumeration, for the estimation of acreage
under winter paddy. Three different methods, using two different types of
sampling units, were used, being discussed and compared with respect to bias,
cost and efficiency. Their advantages and disadvantages are
given. DANIEL Zaxicu, Virginia Poly- 
technic Institute. 


Sharma, O. C., “Factor Analysis of Technical Trades and Educational
Examination Marks of the Aircraftsmen of the Indian Air Force,” Sankhyā, 13
(1953), 27-34.

A factor analysis was made on the results of seven final examinations taken by
75 aircraftsmen training for the Radio Telephone Operators and Telegraphists
trade in the Indian Air Force. Five of the examinations were ‘trade’ tests,
the other two being educational tests (mathematics and science). The factor
analysis was done using two different techniques: (a) the Centroid Method, and
(b) the Method of Principal Components. In each case, the analysis was carried
out to three factors. These three factors accounted for 55.7% of the total
variation in the Centroid Method and 73.5% of the total variation in the
Method of Principal Components. A stopping rule by Burt was used in each case.
The resulting factor matrices were rotated by means of the Method of Extended
Vectors to verify the existence of simple structure. Both methods demonstrated
the same factor pattern. Three group factors were obtained and identified as:
(1) Clerical Ability Factor, (2) Number Ability Factor, and (3) Technical
Skill Factor. H. C. SWEENY, Virginia Polytechnic Institute.


Singer, K., “Application of the Theory of Stochastic Processes to the Study of
Irreproducible Chemical Reactions and Nucleation Processes,” Journal of the
Royal Statistical Society, Series B, 15 (1953), 92-106.

Let $n_1, n_2, \cdots, n_m$, symbolically denoted by the vector n, be the
number of the different molecular species 1, 2, ···, m, in a system of
constant volume. Suppose the system is subject to random variation and the
composition is characterized by P(n; t), the probability that the system has
the composition n at the time t. Difference-differential equations involving
probabilities of changing from one composition to another in a given time are
derived and studied as are also equations involving “first passage” times and
“recurrence” times. Several applications to the study of chemical reactions
are given. PAUL N. SOMERVILLE, Virginia Polytechnic Institute.


Singh, R. P., and Nagar, D. N., “A Study on the Growth of Population in
Rajasthan,” Sankhyā, 13 (1953), 39-42.

Some data for the Rajputana states is taken from the census reports of the
period 1901-1941 and studied with regard to the number of married females in
different age-groups, reproduction according to age groups, distribution of
married females of reproductive age, average number of children born, the
number survived, sex ratio and increase in population. R. L. WINE, Virginia
Polytechnic Institute.


“The National Sample Survey: General Report No. 1,” Sankhyā, 13, Parts 1 and 2
(1953), 47-218.

This paper reports on the National Sample Survey of India covering the period
October 1950 to March 1951. The survey was conducted to supply reliable
statistics relating to production, consumption and other aspects of economic
and social life in India. Data were obtained on size of rural households, per
capita consumer expenditure in rural areas, expenditure on food, expenditure
on clothing and head and footwear, and medical and ceremonial expenses.
Appendix 2, pp. 136-198, contains tables reporting on the data collected under
the above noted general headings. Appendix 3, pp. 197-214, contains facsimile
field schedules.

The design of the survey is discussed and some notes are included relating to
changes to be made in the second round of sampling. Different methods of
selecting the sampling units were adopted in different parts of the country
and the probability of being included in the sample differed from region to
region. Sampling units were selected in two stages: first the villages were
selected after suitable stratification; within each sample village all or a
subsample of 80 households, whichever was less, were stratified into
agricultural and nonagricultural classes and sample households were then
selected at random from each of these strata. RALPH A. BRADLEY, Virginia
Polytechnic Institute.


Whittle, P., “The Analysis of Multiple Stationary Time Series,” Journal of the
Royal Statistical Society, Series B, 15 (1953), 125-39.

The author extends his earlier methods on the application of the least square
principle to the analysis of a single stationary time series to its
application in the analysis of a multiple series. For a purely
nondeterministic stationary multiple process the least square estimation
equations are derived. For a normal process the asymptotic covariances of the
parameter estimates are calculated. The methods developed are illustrated by
the testing of a sunspot model. G. L. EDGETT, Virginia Polytechnic Institute.


Wold, Herman O. A., “Causality and Econ- 
бе, Econometrica, 22 (1954), 162- 
77. 

Following а few general remarks on the 
concept of causality, definitions are pro- 
posed which are considered useful in prob- 
lems involving relations between variables. 
‘Attention is given both to the nonstatis-« 
tical (exact relations) ard statistical points 
of view, but the discussion centers pri- 
marily on the latter. Consider the general 
relation y=f (zu***: za) +2, where 2 
represents the disturbance term. In a con- 
trolled experiment the z's represent control 
variables and y the effect variable. With 
proper design and analysis this may be in- 
terpreted as a causal relation, In the case of 
nonexperimental observations (for example. 
econometric analysis of time series data) 
a relation like that above is defined as causal 
if “it is theoretically permissible to regard 
the variables as involved in a fictive con-,, 
trolled experiment with ал, ee e à for 
cause variables and y variable.” s 


the framework of the definition próposed а 
few simple, illustrative 
are discussed from the econometric point of 
view. The causa} interpretation in the il- 
lustrative relations is discussed р! imari| 
within the framework of recursive systems. 
Employing the concept of a link set the 
author extends the regursive to in- 
clude thé case wh@re one or more effect 
variables in a given “link” (endogenous 
variables at point 1) are jointly cat 
explained by variables in “preVious links” 
a ° е 
* ae 


eo 


915 


(lagged endogenous and lagged or current ` 


exogenous variables). Ivan 
University of California. 


Woolsey, Theodore W., “On the Use of 
Sampling in the Field of Public Health,” 
‘American Journal of Public Health, 44 (1954), 
719-40. f 

The American Public Health Association. 
Statistics Section had its Committee on 
Sampling Techniques prepare this valuable 
article on the uses of sampling for public 
health worlprs. The discussion is broad 
enough to include applications of sampling 
in all fields. The manuscript describes 
when and how sampling may be put to ad- 
vantage, its reliability, and also those 
situations for which sampling is not а help, 
Probability sampling js discussed and sev- 
eral illuminating illustrations are presented. 
In the appendix, a selected bibliography 
on sampling is given with a list of recent 
references .in which probability sampling 
was used to solve public health problems. 
BERNARD С. GREENBERG, University of 
North Carolina. 


Yates, F., and Grundy, P. M., “Selection 
Without Replacement from Within Strata 
with рани Proportional to Size,” 
Још of the Royal ‘Statistical Society, 
Series B, 15 (1953), 253-69. 

In ‘sampling without replacement with 
probability proportional to size, the usual 
formula fer estim&tion of а stratum variate 
by weighting the units in iem propor- 
tion to the size of the units is biased. Nu- 


M. Les, 


mates is given, which, however, for samples 
of «ize greater than two involves consider- 
able laber * 

‘The bias in the ordinary formula for the 
estimation of error is investigated and also 
is found to be small. An «mbiased estimate 
of error is givem which is shown to be more 
efficient that that given by Horvitz and 
Thompson. 

A method of revising size measures so
that, with the usual method of selection, the
true total probabilities of selection are
proportional to the original size measures is
given for samples of size 2.
practice of se-
of a sample
solely so
that the total probabilities shall be propor-

PAUL
Virginia Polytechnic In-


BOOK REVIEWS 


Sexual Behavior in the Human Female. Alfred C. Kinsey, Wardell B. Pomeroy,
Clyde E. Martin, and Paul H. Gebhard. Philadelphia and London: W. B. Saun-
ders Company, 1953. Pp. 842. $8.00.


See review article by Dorothy S. Brady, on pages 696-705.


Statistical Method in Industrial Production. Thirteen Papers plus Foreword by 
A. Bradford Hill given at a Conference held by the Industrial Applications Sec-
tion of the Royal Statistical Society in Sheffield in 1950. London: 1951. Pp. iv,
89. 7s 6d. 


Lloyd A. Knowler, State University of Iowa


| hee Experiences of Statistical Quality Control in a Pottery,” by Arthur 
G. Ellis, is a case history of two applications in the manufacture of pottery 
—pint weight of slip and dry modulus of rupture. The tremendous benefits
which can result in the industry from the use of statistical quality control
chart techniques are indicated. Also, the importance of having missionary
work at or about the foreman level is noted.

“Applications of Control Charts to Brick Manufacture,” by T. G. W. 
Boxall, describes an application of quality control charts in an industry in 
which it is impossible to make big changes in the source of raw material and 
which is concerned with the manufacture of a cheap, mass-produced article.
Through the measurement of but six bricks of about 40,000 burnt in a kiln,
it was discovered that a simple modification to the feeding mechanism of the
presses would result in bricks which would easily meet the specification
limits set by the British Standards Institution. Another study indicated
some minor changes desirable in anticipation of shrinkage characteristics.
Following these two studies, the control chart techniques have been ex-
panded so as to consider crushing strength, weight, and absorption of bricks,
as well as the quality of bricks made in different machines and burnt in dif-
ferent kilns. Also, the technique has been expanded to compare the efficiency
of various brickmaking operations and firing methods.

“Contributions of Statistics to Problems of Chocolate Manufacture,” by
B. Moorhouse, describes a study of the manufacture of moulded chocolate
block containing a “centre” which, on completion, was showing more varia-
tion in weight than was desired. It was noted that among three main lines to
bring the manufacturing process under control were efforts to reduce (1)
variation between moulds, (2) variation between cake positions within the
mould, and (3) variance between day-to-day runs. It is pointed out that the
best way to acquire further knowledge of control chart work is to make as
many applications as possible, and that conclusions drawn from experiments
шы ad when the data are examined by the statistical technique. 

“Productivity Measurement in the U.S.A.,” by H. Ingham, “analyses the
differences in Great Britain and the U.S.A. in attitudes to productivity, and
urges that (Great Britain) should energetically engage in certain specific
statistical investigations of productivity.” Among the points made are the
following: (1) Productivity seems to have become a national myth in the
U.S.A.; in fact, it is pointed out that business men and trade unionists would
become seriously concerned if the figures showed that productivity had not
increased at a rate of about 8 per cent per annum. (2) The people in U.S.A.
are interested in “man-hours required per unit of output.” (3) In U.S.A.
there is an overwhelming emphasis on the “down to earth” type of person,
while the reverse seems to be true in Great Britain.

“Costing of Continuous Processes,” by Philip Lyle, illustrates application
of statistical methods to the determination of the average effect upon the 
costs of a factory, department, or process of a change in output. It is shown 
that a measure of the total variation amongst a series of weekly cost figures 
can be divided into (1) the amount of this variation which can be ascribed to 
changes in output, and (2) the amount due to unknown factors or “error.” 
The knowledge of the error component enables one to predict the costs based 
upon various outputs. Marginal cost is discussed. It is shown that the mar-
ginal concept only applies for short-run variations, and that long-run varia-
tions take place in finite steps for which “arc cost” must be used in place of
“marginal cost.”
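
The division of cost variation described in this paragraph can be sketched with an ordinary least-squares line of weekly cost on weekly output. This is only an illustration under stated assumptions: the straight-line form, the function name decompose_cost_variation, and the figures in the usage note are hypothetical and are not taken from Lyle's paper.

```python
def decompose_cost_variation(outputs, costs):
    """Fit cost = intercept + slope * output by least squares and split the
    total sum of squares of the costs into a part ascribed to output and a
    residual ("error") part. Assumes a straight-line relation (hypothetical)."""
    n = len(outputs)
    if n != len(costs) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(outputs) / n
    mean_y = sum(costs) / n
    sxx = sum((x - mean_x) ** 2 for x in outputs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(outputs, costs))
    syy = sum((y - mean_y) ** 2 for y in costs)
    if sxx == 0:
        raise ValueError("output never varies; its effect cannot be estimated")
    slope = sxy / sxx                    # average change in cost per unit of output
    intercept = mean_y - slope * mean_x
    explained = slope * sxy              # variation ascribed to changes in output
    residual = syy - explained           # variation due to unknown factors
    return {
        "cost_per_unit_output": slope,
        "intercept": intercept,
        "total": syy,
        "explained": explained,
        "residual": residual,
        "residual_variance": residual / (n - 2),
    }
```

For example, decompose_cost_variation([10, 20, 30, 40], [121, 139, 161, 179]) fits a line to four made-up weekly figures; the residual variance it returns is the quantity one would use to judge how closely costs can be predicted at a given output.
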

“Graphical Analysis of Variations in a Production Department Tool,” by
E. A. G. Knowles and C. Roseman, shows, by means of an example, how a
graphical analysis of the results of a complicated experiment can be carried
out by members of a production department familiar with control charts.
In the example, all effects show up on the control charts and they are made
available immediately to all concerned. It is indicated that final tests of sig-
nificance, such as analysis of variance, can be performed by the statistical
section of an organization. A comparison of the results of the graphical an-
alysis is made with those of analysis of variance.

“Comparative Tests in a Single Laboratory,” by W. J. Youden, makes the
point that “It is a fact of experience that a set of measurements made by
different operators at different times or in different localities is subject to
greater variation than a set of measurements made by one operator using the
same apparatus on the same day. The data from an experiment with thermom-
eters have been used to show that even as simple an operation as reading
a linear scale cannot be duplicated nearly as well after a time lapse as on the
same day. The paper emphasizes the necessity of including in the error of the
test all sources of variation which in fact operate on the measurements and
shows how a change in the design of the test may reduce the error.”

“The Statistical Approach to Time Study,” by D. J. Desmond, “first
gives a brief historical survey of the development of Time Study and pro-
ceeds to show how the technique of rating has become a part of modern time
study in many industries. The methods of selecting the normal time for a
job are then discussed, and it is shown that until recently there was no ob-
jective way of determining the quality of any particular time study or meas-
uring the accuracy with which any time study observer is working.

“A new method of analysis is then developed, based on regression analysis,
which gives an objective determination of the normal time of a job in terms
of the recorded times and subjective estimates of the observer. This will give
an estimate of the unknown normal time, and the precision of this estimate
can be calculated and compared with the results obtained by other observers
studying the same job. The various defects in a study can be calculated in
terms of three different parameters which establish the standard of quality of
the studies of the observer. Plotting these statistics on control charts enables
the observer to determine, at a glance, whether he is maintaining the quality
of his work, and to see if he is achieving any improvement. A simple graph-
ical method is described which enables him to estimate all the characteristics
of his study in less time than he usually takes merely to determine his nor-
mal time.


“The method is then developed, by the analysis of variance technique,
to enable any number of studies to be combined. This can lead to the estab-
lishment of a standard of quality for a group of observers, and the signifi-
cance of the differences between individual observers can be examined.
These differences are illustrated by the results of an experiment carried out
on the floor of an assembly shop.”

“Problems of Even Flow in Production,” by E. D. Van Rest, deals with
what are sometimes called “congestion” problems, which arise in providing a
service when the need arises at random intervals of time. Such problems are
frequent in industry; for example, one operator tending several machines or
one machine tended by two or more operators. In fact, the problem is im-
portant in planning an even flow of work through a production process be-
cause the various accidents which are liable to occur delay progress. The
particular problem considered is typified by the spinning frame of a cotton
mill where one person looks after a large number of spindles. The thread of
each spindle may occasionally break and need repair. The type of informa-
tion required is described, as well as the use to be made of it. Similar prob-
lems have received some attention in operations research, as well as in qual-
ity control.

“Statistics Applied to Assembly Process,” by G. A. Barnard, considers
the tolerance limits of an assembled article as related to those of the com-
ponents. In particular, consideration is given to the following types of as-
sembly process: (1) random or interchangeable, (2) semi-random, (3) simple
selective, and (4) multiple selective. The need for a mathematically trained
statistician in any fair-sized plant, to form a link between the production
and cost departments, is observed.

“The Cost of Inspection,” by F. J. Anscombe: “Assessment of the total
cost of an inspection procedure is considered, taking into account the cost of
decisions made on the basis of the inspection. Simple hypothetical process
curves, inspection cost curves, and decision loss curves, are described. A
numerical example of rectifying inspection is considered in some detail, and
the relevance of the Dodge-Romig concepts of AOQL and lot tolerance to
such problems is discussed.”

“Multiple Sampling in Theory and Practice,” by J. H. Enters and H. C.
Hamaker, selects “from among the great variety conceivable such multiple
plans... as can be presented to inspectors in the form of very simple in-
structions, in which use is made of the method of scoring proposed by Bar-
nard for sequential sampling. For plans of this type the operating character-
istics and the average sample size are computed assuming Poisson prob-
abilities. A measure of efficiency, the inverse efficiency, is then obtained by
dividing the average sample size by the sample size of a single sampling plan
possessing practically an identical operating characteristic. The search for
these equivalent single sampling plans is greatly facilitated by specifying the
operating characteristics by their point of control, p_0, and their relative
slope, h_0, defined by

    P(p_0) = \frac{1}{2}

and

    h_0 = -\left[\frac{p}{P}\,\frac{dP}{dp}\right]_{p = p_0},

where P(p_0) is the probability of accepting a lot in which the proportion of
defectives is p_0. On this basis a number of multiple sampling plans are in-
vestigated. Their efficiency is compared with that of double and sequential
sampling, and the influence of the crudeness of the steps and of curtailing is
systematically studied. The actual number of observations is a stochastic
variable the distribution of which is separately considered. In a final section
the experience gained in applying multiple sampling in a factory over a period
of about three years is briefly recorded.”
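
As a rough illustration of the two quantities named in the quotation, the sketch below computes the operating characteristic of a single sampling plan under Poisson probabilities, then locates the point of control (the proportion defective at which the acceptance probability is one half) and the relative slope there. The relative slope is taken to be -(p/P)(dP/dp) at the point of control, which is an assumed reading of the printed definition; the function names and the plan in the usage note are hypothetical.

```python
import math

def poisson_acceptance(n, c, p):
    """Probability of accepting a lot with proportion defective p under a single
    sampling plan (sample size n, acceptance number c), using the Poisson
    approximation with mean n * p."""
    m = n * p
    return sum(math.exp(-m) * m ** k / math.factorial(k) for k in range(c + 1))

def point_of_control(n, c, tol=1e-12):
    """Proportion defective at which the acceptance probability equals 1/2,
    found by bisection (the acceptance probability decreases as p grows)."""
    if poisson_acceptance(n, c, 1.0) >= 0.5:
        raise ValueError("acceptance probability never falls to 1/2 for p <= 1")
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if poisson_acceptance(n, c, mid) > 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def relative_slope(n, c, p0, eps=1e-6):
    """-(p / P) * dP/dp evaluated at p0, with a central-difference derivative."""
    dP = (poisson_acceptance(n, c, p0 + eps)
          - poisson_acceptance(n, c, p0 - eps)) / (2 * eps)
    return -p0 * dP / poisson_acceptance(n, c, p0)
```

For a plan with n = 100 and c = 0 the Poisson acceptance probability is exp(-100p), so point_of_control(100, 0) should lie close to (ln 2)/100 and the relative slope close to ln 2. Matching a multiple plan to the single plan with the same two values would then allow the sample sizes to be compared.
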

“Sequential Analysis of Machine Performance,” by B. H. P. Rivett, con-
siders situations where the variation of a dimension of the product of a
machine is sufficiently small compared with the tolerance, so that the ma-
chine setting can have a zone within which it is free to move without de-
fectives being produced. A method is given for determining (assuming cer-
tain risks) whether the setting of a machine is in this zone. The method can
be adapted to a lot-by-lot inspection scheme for acceptance of the product
with reference to the mean dimension.


Research Methods in the Behavioral Sciences. Leon Festinger and Daniel Katz,
editors. New York: The Dryden Press, 1953. Pp. xi, 660. $5.90.

Daniel O. Price, University of North Carolina


This very excellent volume might more appropriately have been titled
Research Methods in Social Psychology, for the actual title seems merely
to capitalize on a new and popular term. As soon as the reader realizes that
the authors are not trying to make social psychology synonymous with the
behavioral sciences, resentment dies out and the real merits of the book are
more clearly seen.

Following a short introduction on The Interdependence of Social-Psy-
chological Theory and Methods: A Brief Overview (Theodore M. Newcomb),
the volume is divided into five parts: Research Settings, Procedures for
Sampling, Methods of Data Collection, The Analysis of Data, and The Ap-
plication of Research Findings.

Part I, Research Settings, deals with The Sample Survey, Field Studies, 
Experiments in Field Settings, and Laboratory Experiments. These chap-
ters, each by a different author, are well integrated.

Part II, Procedures for Sampling, has only one chapter, Selection of the 
Sample by Leslie Kish. In the reviewer's opinion this is one of the best brief 
(65 pages) treatments of sampling that a research worker can find in the 
literature. It is sound and practical, even including a brief section on non- 
sampling errors. 

The Methods of Data Collection (Part III) includes Problems of Objective 
Observation; The Use of Documents, Records, Census Materials, and In- 
dices; The Collection of Data by Interviewing; and the Observation of Group 
Behavior. Had the book been written under the title which it now carries, 
we might have expected a chapter on case studies of individuals. The chapter 
on Problems of Objective Observation (Helen Peak) includes, among other
things, comments on item analysis, comparisons of Thurstone, Likert, and
Guttman scales, and discussions of validity and reliability. The chapter on
The Collection of Data by Interviewing (Charles F. Cannell and Robert L.
Kahn) includes not only material on the psychological basis of the interview
and principles of interviewing but also material on questionnaire construc-
tion, training of interviewers, and a detailed sample interview. The chapter
on Observation of Group Behavior (Roger W. Heyns and Alvin F. Zander)
deals with “two principle types of observation instruments: category systems 
and rating scales,” and deals only briefly with observational situations. 

Despite the generally high quality of this volume, Part IV, The Analysis 
of Data, is probably the meatiest section of the book. The chapter on An-
alysis of Qualitative Material (Dorwin P. Cartwright) is an excellent pres-
entation of how to develop and use a plan of content analysis or coding
(the terms are used interchangeably). The Theory and Methods of Social
Measurement (Clyde H. Coombs) is a chapter that gets at the very roots of
social measurement, though it is so tightly written as to be quite heavy going
in places. (Coombs uses the term “qualitative” in a different sense than does
Cartwright in his chapter.) Keith Smith’s chapter on Distribution-
Free Statistical Methods and the Concept of Power Efficiency is, among other
things, an excellent collection and presentation of distribution-free statistical
methods.


The last part and chapter, The Utilization of Social Science (Rensis Likert
and Ronald Lippitt), is a good discussion of the procedures, policies, and
problems involved in the application of research findings.
All chapters include good bibliographies and lots of live, illustrative ma- 
terial. The book will, quite properly, find a wide market as a text and refer- 
ence book in research methods. 


Income and Wealth: Series III. Milton Gilbert, editor. Papers by Milton Gilbert,
Shigeto Tsuru and Kazushi Ohkawa, Richard Stone, and Kurt Hansen, Tibor
Barna, S. Herbert Frankel, Frederic Benham, V. K. R. V. Rao, Daniel Creamer,
Ingvar Ohlsson, and Francois Perroux, Georges Guilbaud, Jacques Mayer, Jean 
Albert, and Marcel Malissen. Cambridge: Bowes and Bowes, 1951. Pp. xiii, 261. 
Price 35s. 


Earl R. Rolph, University of California (Berkeley)


This volume contains ten papers delivered at the meeting of the Inter-
national Association for Research in Income and Wealth held at Royau-
mont (France) in 1951. Two of the papers provide data on national income
over a long period—for France since 1780 and for Japan since 1878. The de-
tailed information these papers contain should be of especial interest to eco-
nomic historians. Of the remaining papers, four apply social accounting con-
cepts to underdeveloped areas, three deal with conceptional and theoretical
topics, and one, likely to be of most interest to statisticians, is an analysis
of the problem of the reliability of national income data.

Milton Gilbert maintains, persuasively in my judgment, that the reli-
ability of a national income component can be learned only by reviewing the
sources of the data and the methods of estimation employed. Meaningful
numerical measures of reliability cannot be provided and attempts to do so
might easily be misleading. Independent estimates do not always increase
reliability because in many cases one source of data is known to be definitely
superior to any other. The great differences in the quality of the data out of
which national income statistics are built mean in fact that national income
estimates of different countries are not truly comparable, even if the con-
ceptual differences are unimportant. This observation lends added weight
to the opinion of those who were dubious of the value of an international
agreement on basic income concepts, especially since it comes from one who
was an important participant in those conferences.

The paper of Ingvar Ohlsson is devoted to that much discussed topic,
the treatment of government activities in social accounting. There is little
new information for those who have followed that literature, unless it is the
resurrection of the plea for more attention to the purposes of constructing
national accounts. If the plea were taken seriously, the construction of social
accounts might be indefinitely postponed while debates were carried on as to
what purposes are important. Presumably the purpose of intellectual work
is to tell the truth as best one can regardless of how congenial or uncongenial
the results may be. Pragmatism is also the tone of a longish paper by Richard
Stone and Kurt Hansen on inter-country comparisons. The authors succeed
in arriving at definite conclusions, though one might wish that they would
pause once in a while to inform their readers why the tests they select have
relevance—why, for example, the effect on relative prices is a proper basis
for distinguishing among taxes. It is hard to think of any government action
that does not in fact affect relative prices.

Tibor Barna extends relativism to economic theory; apparently we must
have a different set of economic theories for every country. I was surprised to
learn that in France, in contrast to Great Britain, it may be proper to treat
the repayment of public debt as a part of national income because in France
such repayments induce increases in private expenditures whereas in Britain
they do not. Mr. Barna would, I think, have some difficulty in finding a con-
sensus among British economists that monetary policy is completely un-
workable in their country. But the compiler of national income statistics
need not venture into the difficult questions of monetary policy. The repay-
ment of any debt, including a public debt, is an exchange of assets—not an
income transaction—in France, Great Britain, or India. Conceptual dis-
tinctions need be kept apart from the determinants of behavior.

Of the four papers concerned with underdeveloped areas, Mr. Frankel’s
is mainly an elaboration of the view that it is wrong to suppose that an in-
crease in real national income can be assumed to increase welfare. With this
position one may agree or disagree, but with his insistence that the mere cal-
culation of national income involves an implicit acceptance of certain welfare
notions, I at least cannot agree. Frederic Benham in his Comments provides
some sobering analysis of Frankel’s rather strongly, but not always clearly,
stated remarks. V. K. R. V. Rao tackles the difficult problem of international
comparisons of real income, refuting the common assumption that the real
incomes of less developed societies are comparatively understated because of
their greater amount of household industry. One may endorse his recom-
mendation that, with the present state of knowledge, the United Nations
cease setting out figures purporting to be international income comparisons.
The reader might find his remarks more convincing if he had avoided basing
some conclusions on his own personal value judgments, such as that the
real national income of the United States is overstated because we include
the activities of the liquor business in the totals. Mr. Creamer in his paper
cites chapter and verse for the advantages of having national income data
for an underdeveloped area—Puerto Rico. He tempers his remarks with the
warning that there are other and, in some cases, better ways to spend intel-
lectual resources devoted to the study of underdeveloped areas than in
estimating national income. His paper is informative.

Papers delivered at a conference rarely make a satisfactory book. Care-
ful editing would have made this a smaller and perhaps a better one. It is
fervently to be hoped that in any future volume of this series, an index will
be provided.




Consumer Attitudes and Demand, 1950-1952. George Katona and Eva Mueller.
University of Michigan: Institute for Social Research, Survey Research Center
Publication No. 12, 1953. Pp. v, 119. Paper $1.50; cloth $2.00.


Walter D. Fisher, Kansas State College


This empirical study reports on the buying behavior of United States
families in a period of prosperity immediately following inflation. In
many ways it is “an extension of research into consumer attitudes, expecta-
tions, and intentions initiated in the Surveys of Consumer Finances” (p. iii).
It is a pioneering work, using relatively new concepts having future promise,
and at the same time a workmanlike job. Although the book is thin, the
material inside is meaty.

The basic hypothesis tested is that the amount of consumer spending on 
durable goods is influenced by certain “attitudinal” variables, including per- 
ceptions, expectations, and opinions as expressed by consumers themselves. 
These concepts, developed in some detail in Katona’s earlier book, Psycho- 
logical Analysis of Economic Behavior, are reviewed briefly in a theoretical 
chapter. Ample evidence is produced to establish this hypothesis, at least in 
the short run, although more attention is given to opinions of buying condi- 
tions than to actual purchases. Factors having most influence on these opin- 
ions are indicated to be consumers’ perceptions of price movements in the 
recent past, and their evaluations of the general economic outlook in the near
future.

Some of the most interesting findings concern prices and inflation. Con-
sumers were definitely conscious of and resented the price increases of 1950
and 1951, and these attitudes affected adversely their willingness to buy.
However, they did not fear inflation to the extent of making any appreciable 
shifts in the form of their savings from bonds to stocks, nor did they fear 
for the soundness of their money. 

Findings are based primarily on data from four successive interview-sur-
veys, each a sample of approximately 1000 families representing all private
dwelling units in the United States, taken about six months apart with the
first one in June, 1951. Each sample was independently drawn by a process
of four-stage area probability sampling, using the controlled selection fea-
tures developed by the Sampling Section of the Survey Research Center. An
appendix table contains convincing evidence that the four samples were
nearly identical in a variety of demographic characteristics such as size and
occupations of the families. The interviews, approximately an hour long,
contained fixed questions with no latitude given to the interviewer regarding
formulation and sequence, except for occasional probes of indefinite an-
swers.

The sample data are presented throughout in the form of percentages of
responses, or of respondents, having certain attributes. Two major techniques
are used: (1) answers to identical questions are tabulated separately for each
time point, trends being inferred by making comparisons between findings
at different times; and (2) answers to different questions—usually two at a
time—are presented in contingency tables with all time periods pooled to-
gether, and relationships inferred between two factors at a time by noting
differences in certain percentages. Although no reference to statistical signifi-
cance is made in the text, the reader is able to make his own judgments by
use of an excellent table of sampling errors in the appendix; and, in fact,
most of the findings claimed are statistically significant by conventional
standards.

At times the use of coefficients of association or similar measures from the
theory of attributes would have aided the reader in digesting the many
arrays of percentages displayed. In some sections more use of joint relation-
ships involving two or more independent variables would have been inter-
esting and also more indicative of the relative importance of the various
factors.

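One coefficient of association of the kind the reviewer may have in mind is Yule's Q for a fourfold table. The sketch below is an illustration only: the review does not say which coefficient was intended, the function name yules_q is hypothetical, and the counts in the usage note are invented rather than taken from the book.

```python
def yules_q(a, b, c, d):
    """Yule's coefficient of association Q for the 2 x 2 table

                    bought   did not buy
        favorable     a           b
        unfavorable   c           d

    Q = (ad - bc) / (ad + bc). It is 0 when the two attributes are independent
    in the table and reaches +1 or -1 when one cell of a diagonal is empty."""
    ad = a * d
    bc = b * c
    if ad + bc == 0:
        raise ValueError("Q is undefined when ad + bc is zero")
    return (ad - bc) / (ad + bc)
```

With invented counts, yules_q(30, 20, 10, 40) gives a single summary figure for a table that would otherwise be read as two percentages (60 per cent against 20 per cent), which is the kind of condensation the reviewer asks for.
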
Relationship between actual purchases and the other variables could have 
been claimed more effectively from the time-series comparisons alone. The 
procedure followed of seeking to establish a relationship between reported 
purchases “during the last 12 months” and the opinion “this is a good (or 
bad) time to buy” is not convincing—for reasons which the authors recog- 
nize: first, the “last 12 months” is rather a long time in the context of this 
study; second, respondents would tend to rationalize recent purchases, es-
pecially since in the interviews the opinion expressed followed immediately
the statement of purchases.

The authors advance also a second more ambitious hypothesis: that the 
use of attitudinal variables significantly improves knowledge and predicting 
ability over what can be done by using non-attitudinal variables alone. “It
is claimed here that the use of functional relationships between consumer
attitudes (as well as traditional financial variables) and spending will in-
crease the probability of correct predictions” (p. 58). The present volume
alone does not establish this claim, and does not seem designed to do so. No
comparisons are made between attitudinal and non-attitudinal variables as
predictors, nor between non-attitudinal predictors as used alone and as used
along with attitudinal ones. Moreover, no empirical evidence is presented
that would contradict a hypothesis that all attitudinal variables are ulti-
mately dependent on or caused by non-attitudinal ones. The possibility of
admitting such a view seems to be entertained by the authors when they
state: “Changes in attitudes are rarely fortuitous. They are dependent on
developments which induce people to restructure their thinking” (p. 57). 

It may well be found useful, in the formulation and testing of models of
[...] behavior, to introduce variables that cannot be classified clearly
[...]. Indeed, one of the most significant
[...] study—the frequency of the opinion that prices went
[...] does not represent an attitude toward
[...] strictest sense of the term, registering instead a per-
ception [...] influence attitudes (p. 46). Further research will de-
termine whether other such borderline cases exist, and will also clarify the
nature of the causal interaction between the variables—psychological and
otherwise—that enter into economic fluctuations and development.

The real contribution of this book lies in its emphasis on matters that have 
been somewhat neglected in economic analysis—especially the importance
of consumer demand in business cycles, and the role of psychological vari-
ables; and also in its demonstration of the feasibility of representing such
variables by answers to questions in interview-surveys. Moreover, it helps
fill a great current need for more empirical work and more interdisciplinary
research in the social sciences.


Cardano, The Gambling Scholar. Oystein Ore. Princeton, New Jersey: Princeton 
University Press, 1953. Pp. xiv, 249. $4.00. 


Meyer Dwass, Northwestern University 


This is a story of scholarship in the fascinating and fantastic Renaissance.
The scholar, Cardano, is presented in a light of sympathy and under-
standing, which for him, is an aura distinctly new. There is, for instance,
the matter of Tartaglia and the cubic. E. T. Bell gives us a typical report
(Development of Mathematics, 2nd edition, McGraw-Hill, p. 117): “Cardan
... whose name ornaments the solution of the cubic in every intermediate
textbook on algebra, obtained the solution from Tartaglia under promise of
secrecy and published it as his own in the Ars Magna (1545).” We get from
Ore a distinctly new slant on what has become an old party line: Sometime
before 1515, an Italian professor, Scipione del Ferro, invented a method
to solve the equation x³+ax=b. As was then the custom, the result was
buried in secrecy. A favorite pastime of Renaissance academicians was a
type of quiz contest with a heavy jackpot as well as points toward academic
advancement for the winner. Hence, results such as Ferro’s were not as a
rule published, but were kept as secret weapons for these public disputes. It
was in just such a public dispute, years later, in 1535, that Tartaglia redis-
covered the method. Cardano, a physician of universal interests, was writing
what he hoped would be the complete algebra of his day. Cardano succeeded
in wrangling the result from Tartaglia, but only under the frustrating oath
that it never be disclosed or published. This was in 1535. In the ten years
that followed Tartaglia’s rediscovery, still others rediscovered the method.
Moreover, Cardano and a pupil succeeded in finding methods for dealing
with more general forms of the cubic. What is more important, Cardano un-
earthed Ferro’s original result and priority. Thus, Cardano felt himself re-
lieved of oath and duty. His Ars Magna, published in 1545, contained the
method, a statement that it was given to him by Tartaglia, and also a state-
ment allocating priority to del Ferro. Cardano stands vindicated.

What should be of greater interest to statisticians is Cardano’s virtually 
overlooked role in the early’ history of probability. Ore promotes the thesis 
that the father was not Pascal ‘but Caydano. Cardaho wa’ a passionate 
gambler and it was inevitable that pis mathematical interests should lead. 


ғ ° 
» 


г) 
926 AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1954 


> 

him to theoretical speculations on the laws of chance. He had the miserable
habit, however, of writing down speculations on little scraps of paper, jotting
down improvements, revisions, and random thoughts as they came to him.
Eventually the scraps were published with insufficient rewriting or editing:
a collection of facts and ridiculousness. Ore dissects these hitherto
undissected writings in a triple role of mathematician, classicist, and detective.
Among his conclusions are the following: Cardano understood and formulated
the definition of probability of an event in terms of equally likely cases.
He used this to compute correctly many of the probabilities for dice and
other games. He also succeeded in computing many probabilities incorrectly.
His main device in the latter would in modern terms read something like
P(A or B or C or · · ·)=P(A)+P(B)+P(C)+ · · ·. However, he fully realized
that this was an approximation which was often quite unsatisfactory.
He also evolved the “power law,” that the probability of n occurrences of an
event A in n independent trials is Pⁿ(A).
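
The gap between the additive device and an exact calculation can be made concrete with a short sketch. The example of throwing a six with a fair die is an illustration chosen here, not one taken from Ore's book, and the function names are hypothetical. It assumes independent throws and compares the additive sum with the exact chance of at least one success, alongside the "power law" for success on every trial.

```python
from fractions import Fraction


def additive_approximation(p, n):
    """Additive device: P(A or B or ...) taken as P(A) + P(B) + ... for n events of probability p."""
    return n * p


def exact_at_least_once(p, n):
    """Exact probability that an event of probability p occurs at least once in n independent trials."""
    return 1 - (1 - p) ** n


def power_law(p, n):
    """'Power law': probability that the event occurs in every one of n independent trials."""
    return p ** n


six = Fraction(1, 6)  # assumed example: throwing a six with a fair die
for n in (1, 2, 3, 6):
    print(n, additive_approximation(six, n), exact_at_least_once(six, n), power_law(six, n))
```

For three throws the additive sum gives 1/2 while the exact value is 91/216, and by six throws the additive sum has reached 1, which shows why such an approximation would often be unsatisfactory.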

In this review (as in the book) are emphasized two highlights of Cardano's
life: the cubic and probability. But Cardano lived, loved, invented, gambled,
suffered, and died. Ore describes all this in a crisp and readable style. This
is a book I recommend.



Gamma Globulin in the Prophylaxis of Poliomyelitis: An evaluation of the
efficacy of gamma globulin in the prophylaxis of paralytic poliomyelitis as used
in the United States 1953. Public Health Monograph no. 20. Report of the
National Advisory Committee for the Evaluation of Gamma Globulin in the
Prophylaxis of Poliomyelitis, Public Health Service Publication No. 358, U. S.
Department of Health, Education, and Welfare. United States Government
Printing Office, Washington: 1954. For sale by the Superintendent of
Documents, U. S. Government Printing Office, Washington 25, D. C., $1.25, pp.
vi+178.

The 1953 study reported here is not to be confused with the 1954 vaccine
trials for the control of poliomyelitis.

Experiments by Hammon and associates based on 12 gamma-globulin-inoculated
cases and 46 gelatin-inoculated cases suggested that gamma
globulin might be useful in modifying the severity of poliomyelitis, or even
in preventing it (p. 3).

A national study during 1953 was conducted by the Communicable Disease
Center of the Public Health Service, planned and guided by a National
Advisory Committee (p. 1); during the 1953 summer 235,000 children were
inoculated in cities and communities where there were outbreaks of
poliomyelitis. It is said (p. 1) that “the records of cases collected in this study
have a greater accuracy, consistency, and validity than any that have been
collected on such an extensive scale heretofore.” “The committee recognized
that it would be very difficult to conduct rigidly controlled studies in the
United States during 1953” (p. 3). “. . . the committee recommended four
approaches to the problem:




“1. Descriptive epidemiologic studies for each of the areas where mass use
of gamma globulin was employed.

“2. A comparison of the severity of paralysis of patients developing the
disease immediately before mass use with the severity of those acquiring the
disease after receiving gamma globulin.

“3. Study of the severity of paralysis among multiple-case households; . . .

“4. The documentation of administrative aspects of the distribution of
gamma globulin” (p. 3).

Appendix B gives reports of epidemiological investigations in thirteen 
mass inoculation areas, 1953. An evaluation was based on: (1) asymmetry
of epidemic curves; (2) shift in age distribution to older groups not receiving 
gamma globulin, this shift beginning after mass distribution; (3) modification 
in the duration of epidemics; and (4) differential attack rates. This evaluation 
turned out to be inconclusive for various technical reasons (pp. 10-18), and in 
any case was not very encouraging. 

The study of severity of paralysis in inoculated and uninoculated patients
concluded (p. 21) “. . . its preventive effect in community prophylaxis as
practiced during 1953 has not been demonstrated. Also, no modification of
the severity of paralysis by gamma globulin was shown. Nevertheless, the
committee cannot say that the use of gamma globulin by mass inoculation
produced no effect.” The need for a more carefully controlled experiment is
described.

The multiple-case household study was regarded as adequate for reliable
conclusions (p. 85): “They indicate that with the preparations employed and
in the dosages used, the administration of gamma globulin to familial
associates of patients with poliomyelitis had no significant influence on:

“1. The severity of paralysis developing in subsequent cases.

“2. The proportion of nonparalytic poliomyelitis among the subsequent
cases who received gamma globulin before onset.

“3. The classical pattern of familial aggregation of cases in the country at
large.”

The study of administrative problems may be of value in future work. 

Dr. Hammon comments on the study, appropriately reminding the
reader of numerous limitations, including the lack of suitable controls.
He feels that the modification issue has not been settled, and that the gamma
globulin was given too late, but states that the “agent has an extremely
limited application in the field of preventive medicine and will not produce
dramatic results in general use” (p. 90).
F.M. 


89024 


69801 


53514 
98540 
86345 
01363 
61889 


56310 
10758 
84421 
09197 
37682 


64756 
86602 
14036 
14255 
04769 


90884 
08164 
06501 
48215 
58499 


96226 
82590 
62154 
77108 
47279 


73087 


288485 - 


67874 
07525 
54782 


75102 


RANDOM DIGITS (20,876-21,875) 


With this issue, the Journal will discontinue publication of random digits. The complete set from 
which these have been taken is now being published for The Rand Corporation by The Free Press 
(Glencoe, Illinois) under the title A Million Random Digits. 


46997 
33790 
48080 
61369 
79189 


42301 
65782 
38292 
83891 
12956 


53264 
45911 
21262 
38053 
12475 
51445 
77854 
63253 
87651 
98636 


91659 
36623 
35514 
03385 
55993 


68245 
54783 
33728 
78610 
20379 
94735 
56278 
88381 
61178 
28332 


34141 


"T8775 


* 192652 


92973 


59666 
08309 
55897 


23696 


99247 


28363 
27766 
30241 
39383 
57793 


72641 
13414 
06384 
16533 
46558 


17783 
25370 
77821 
54166 
59415 


03305 
85469 
04381 
04636 
16083 


02019 
61490 
82578 
01110 
49786 


39447 
36263 
81683 
06989 
92818 


27935 
72762 
41494 
02199 
28198 


44332 
85439 
84935 
10792 
48414 




INDEX TO VOLUME 49, 1954


ARTICLES


ANDERSON, R. L., The Problem of Autocorrelation in Regression Analysis
ANDERSON, R. L., SEN, A. R., and FINKNER, A. L., Comparison of Stratified
  Two-Stage Sampling Systems
ANDERSON, T. W., and DARLING, D. A., A Test of Goodness of Fit
BARGER, HAROLD, and KLEIN, LAWRENCE, A Quarterly Model for the U.S.
  Economy
BAUER, THEODORE J., DONOHUE, JAMES F., LARSEN, VINCENT, ISKRANT,
  ALBERT P., and REMEIN, QUENTIN R., Do Persons Lost to Long Term
  Observation have the Same Experience as Persons Observed?
BELLOC, NEDRA B., Validation of Morbidity Survey Data by Comparison
  with Hospital Records
BELZ, MAURICE H., and HOOKE, ROBERT, Approximate Distribution of the
  Range in the Neighborhood of Low Percentage Points
BIRNBAUM, ALLAN, Combining Independent Tests of Significance
BIRNBAUM, ALLAN, Statistical Methods for Poisson Processes and
  Exponential Populations
BLANK, DAVID M., Relationship between an Index of House Prices and
  Building Costs
BRADY, DOROTHY S., The Kinsey Report on Females
BURR, IRVING W., Use of Experiments in Teaching Engineering Statistics
CHERNOFF, HERMAN, and LIEBERMAN, GERALD J., Use of Normal
  Probability Paper
COCHRAN, WILLIAM G., The Present Structure of the Association
COCHRAN, WILLIAM G., MOSTELLER, FREDERICK, and TUKEY, JOHN W.,
  Principles of Sampling
COHEN, A. C., JR., Estimation of the Poisson Parameter from Truncated
  Samples and from Censored Samples
COHEN, SAMUEL E., and LIPSTEIN, BENJAMIN, Response Errors in the
  Collection of Wage Statistics by Mail Questionnaire
DARLING, D. A., and ANDERSON, T. W., A Test of Goodness of Fit
DEEMER, WALTER L., JR., Cargo Loss in Ferrying Operations
DEMING, W. EDWARDS, On the Presentation of the Results of Sample Surveys
  as Legal Evidence
DONOHUE, JAMES F., BAUER, THEODORE J., LARSEN, VINCENT, ISKRANT,
  ALBERT P., and REMEIN, QUENTIN R., Do Persons Lost to Long Term
  Observation have the Same Experience as Persons Observed?
DURAND, DAVID, Joint Confidence Regions for Multiple Regression
  Coefficients
FABRICANT, SOLOMON, Cycles in the Balance of Payments
FINKNER, A. L., SEN, A. R., and ANDERSON, R. L., Comparison of
  Stratified Two-stage Sampling Systems
FREUDENTHAL, A. M., and GUMBEL, E. J., Minimum Life in Fatigue
GOODMAN, LEO A., Some Practical Techniques in Serial Number Analysis
GOODMAN, LEO A., and KRUSKAL, WILLIAM H., Measures of Association
  for Cross Classifications
GRAB, EDWIN L., and SAVAGE, I. RICHARD, Tables of the Expected Value
  of 1/X for Positive Bernoulli and Poisson Variables




GUMBEL, E. J., Applications of the Circular Normal Distribution . . . 267
GUMBEL, E. J., and FREUDENTHAL, A. M., Minimum Life in Fatigue . . . 575
HILDRETH, CLIFFORD, Point Estimates of Ordinates of Concave Functions . . . 598
HOOKE, ROBERT, and BELZ, MAURICE H., Approximate Distribution of
  the Range in the Neighborhood of Low Percentage Points . . . 620
HOUTHAKKER, H. S., Demand Analysis . . . 88
ISKRANT, ALBERT P., BAUER, THEODORE J., DONOHUE, JAMES F., LARSEN,
  VINCENT, and REMEIN, QUENTIN R., Do Persons Lost to Long Term
  Observation have the Same Experience as Persons Observed? . . . 36
KING, E. P., Optimum Grouping in One-criterion Variance Components
  Analysis . . . 637
KISH, LESLIE, and LANSING, JOHN B., Response Errors in Estimating Value
  of Homes . . . 520
KLEIN, LAWRENCE, and BARGER, HAROLD, A Quarterly Model for the U.S.
  Economy . . . 413
KRUMBEIN, W. C., Application of Statistical Methods to Sedimentary Rocks . . . 51
KRUSKAL, WILLIAM H., and GOODMAN, LEO A., Measures of Association for
  Cross Classifications . . . 732
LANSING, JOHN B., and KISH, LESLIE, Response Errors in Estimating Value
  of Homes . . . 520
LARSEN, VINCENT, BAUER, THEODORE J., DONOHUE, JAMES F., ISKRANT,
  ALBERT P., and REMEIN, QUENTIN R., Do Persons Lost to Long Term
  Observation have the Same Experience as Persons Observed? . . . 36
LEBERGOTT, STANLEY, Measurement for Economic Models . . . 209
LIEBERMAN, GERALD J., and CHERNOFF, HERMAN, Use of Normal
  Probability Paper . . . 778
LIPSTEIN, BENJAMIN, and COHEN, SAMUEL E., Response Errors in the
  Collection of Wage Statistics by Mail Questionnaire . . . 240
LOMAX, K. S., Business Failures: Another Example of the Analysis of Failure
  Data . . . 847
MEIER, PAUL, Analysis of Simple Lattice Designs with Unequal Sets of
  Replications . . . 786
MOSTELLER, FREDERICK, COCHRAN, WILLIAM G., and TUKEY, JOHN W.,
  Principles of Sampling . . . 13
MYERS, ROBERT J., Accuracy of Age Reporting in the 1950 United States
  Census . . . 826
MYERS, ROBERT J., Interpreting Mortality after Retirement . . . 499
NUTTER, G. WARREN, Growth by Merger . . . 448
OLDS, EDWIN G., The Experimental Approach in the Teaching of Statistics . . . 890
REMEIN, QUENTIN R., BAUER, THEODORE J., DONOHUE, JAMES F., LARSEN,
  VINCENT, and ISKRANT, ALBERT P., Do Persons Lost to Long Term
  Observation have the Same Experience as Persons Observed? . . . 36
RICE, STUART A., Coordinating the U.S. Statistical System . . . 438
SAVAGE, I. RICHARD, and GRAB, EDWIN L., Tables of the Expected Value
  of 1/X for Positive Bernoulli and Poisson Variables . . . 169
SAVILLE, LLOYD, Cyclical Fluctuations in Foundry Activity . . . 853
SEN, A. R., ANDERSON, R. L., and FINKNER, A. L., Comparison of
  Stratified Two-stage Sampling Systems . . . 539
SIMON, HERBERT A., Spurious Correlation: A Causal Interpretation . . . 467
SMITH, R. TYNES, Technical Aspects of Transportation Flow Data . . . 227
SOGGE, TILLMAN M., Industrial Classes in the United States, 1870 to 1950 . . . 251


STOLLER, DAVID S., Univariate Two-Population Distribution-Free
  Discrimination
STUART, ALAN, Asymptotic Relative Efficiencies of Distribution-Free Tests of
  Randomness against Normal Alternatives
TUKEY, JOHN W., Unsolved Problems of Experimental Statistics . . . 706
TUKEY, JOHN W., COCHRAN, WILLIAM G., and MOSTELLER, FREDERICK,
  Principles of Sampling . . . 13
WEILER, H., A New Type of Control Chart Limits for Means, Ranges and
  Sequential Runs
WHITE, HELEN R., Empirical Study of the Accuracy of Selected Methods of
  Projecting State Populations . . . 480
WILLCOX, WALTER F., Methods of Apportioning Seats in the House of
  Representatives . . . 685
ZARKOVIC, S. S., Sampling Control of Literacy Data . . . 510


BOOK REVIEWS 
BOWKER, A. H., and GOODE, H. P., Sampling Inspection by Variables
  . . . H. C. HAMAKER 386
BRITISH TRANSPORT COMMISSION, Bristol on the Move
  . . . LEONARD P. ADAMS 675
BROSS, IRWIN D. J., Design for Decision . . . D. V. LINDLEY 647
CHANG, TSE CHUN, Cyclical Movements in the Balance of Payments
  . . . SOLOMON FABRICANT 184
CHILTON, NEAL W., Analysis in Dental Research . . . GEOFFREY BEALL 401
DARMOIS, G., and MORICE, E., editors, Bibliographie sur la méthode
  statistique . . . 203
DEANE, PHYLLIS, editor, Bibliography on Income and Wealth, Volume I,
  1948-1949 . . . DOROTHY S. BRADY 395
DOOB, J. L., Stochastic Processes . . . P. A. P. MORAN 661
FESTINGER, LEON, and KATZ, DANIEL, Research Methods in the Behavioral
  Sciences . . . DANIEL O. PRICE 919
FINNEY, D. J., An Introduction to Statistical Science in Agriculture
  . . . H. W. NORTON 389
KOLGRR, Tonk, Confidence Limite Tables for ‘Sampleasof eet Dis- 
tributed Data. . o. 201 
FOOTE, RICHARD JAY, and THOMSEN, FREDERICK LUNDY, Agricultural
  Prices . . . L. J. NORTON 391
FRENCH, DAVID G., An Approach to Measuring Results in Social Work
  . . . JOHN E. WALSH 408
FRUMKIN, GREGORY, Population Changes in Europe Since 1939
  . . . DOUGLAS F. DOWD 674
GEBHARD, PAUL H., KINSEY, ALFRED C., POMEROY, WARDELL B., and
  MARTIN, CLYDE E., Sexual Behavior in the Human Female
  . . . DOROTHY S. BRADY 696
GILBERT, MILTON, editor, Income and Wealth, Series III
  . . . EARL R. ROLPH 921
GOEDICKE, VICTOR, Introduction to the Theory of Statistics
  . . . BERNARD L. WELCH 378
GOODE, H. P., and BOWKER, A. H., Sampling Inspection by Variables
  . . . H. C. HAMAKER 386
GREBLER, LEO, The Role of Federal Aids in Residential Construction
  . . . SHERMAN J. MAISEL 399




HANSEN, MORRIS H., HURWITZ, WILLIAM N., and MADOW, WILLIAM G.,
  Sample Survey Methods and Theory . . . TORE DALENIUS
HARTKEMEIER, HARRY P., Elementary Statistical Analysis
  . . . ACHESON J. DUNCAN


iR Haney Р, лой Methods so. . P. С. Hammer 
HATT, PAUL K., Backgrounds of Human Fertility in Puerto Rico: A
  Sociological Survey . . . HENRY S. SHRYOCK, JR.
HAWLEY, AMOS H., Intrastate Migration in Michigan, 1935-1940
  . . . JOHN FOLGER
HERDAN, GUSTAV, Small Particle Statistics . . . BENJAMIN EPSTEIN
HOOD, WILLIAM C., and KOOPMANS, TJALLING C., editors, Studies in
  Econometric Method . . . KENNETH J. ARROW
HURWITZ, WILLIAM N., HANSEN, MORRIS H., and MADOW, WILLIAM G.,
  Sample Survey Methods and Theory . . . TORE DALENIUS
IOWA STATE COLLEGE, STATISTICAL LABORATORY, The WOI-TV Audience
  . . . LESTER R. FRANKEL
JOHNSON, ELLIS A., The Application of Operations Research to Industry
  . . . A. W. SWAN
KATONA, GEORGE, and MUELLER, EVA, Consumer Attitudes and Demand,
  1950-1952 . . . WALTER D. FISHER
KATZ, DANIEL, and FESTINGER, LEON, Research Methods in the Behavioral
  Sciences . . . DANIEL O. PRICE
KINSEY, ALFRED C., POMEROY, WARDELL B., MARTIN, CLYDE E., and
  GEBHARD, PAUL H., Sexual Behavior in the Human Female
  . . . DOROTHY S. BRADY
KISER, CLYDE V., and WHELPTON, P. K., Social and Psychological Factors
  Affecting Fertility, Volume Three . . . E. LEWIS-FANING
KITAGAWA, TOSIO, Tables of Poisson Distribution
  . . . WILLIAM G. COCHRAN
KLEIN, LAWRENCE R., A Textbook of Econometrics . . . KENNETH J. ARROW
KOOPMANS, TJALLING C., and HOOD, WILLIAM C., editors, Studies in
  Econometric Method . . . KENNETH J. ARROW
LACEY, OLIVER L., Statistical Methods in Experimentation: An Introduction
  . . . OSCAR KEMPTHORNE
LEHMAN, HARVEY C., Age and Achievement . . . LEONA E. TYLER
LEV, JOSEPH, and WALKER, HELEN M., Statistical Inference
  . . . PALMER O. JOHNSON
LEWIS, EDWARD E., Methods of Analysis in Economics and Business
  . . . Z. SZATROWSKI
LINDLEY, D. V., and MILLER, J. C. P., Cambridge Elementary Statistical
  Tables

MacNizcz, ri, Industrial COOL | sd З Н н. Соар 

MADOW, WILLIAM G., HANSEN, MORRIS H., and HURWITZ, WILLIAM N.,
  Sample Survey Methods and Theory . . . TORE DALENIUS
MANN, HENRY B., Introduction to the Theory of Stochastic Processes
  Depending on a Continuous Parameter . . . ULF GRENANDER
MARTIN, CLYDE E., KINSEY, ALFRED C., POMEROY, WARDELL B., and
  GEBHARD, PAUL H., Sexual Behavior in the Human Female
  . . . DOROTHY S. BRADY
MARYLAND, hri on OF, "Retail Prices Jaa the Consumer Preference 
Е Е ra cs SoPHIA GOGEK 



MARX, DANIEL, JR., International Shipping Cartels
MILLER, J. C. P., and LINDLEY, D. V., Cambridge Elementary Statistical
  Tables
MORICE, E., and DARMOIS, G., editors, Bibliographie sur la méthode
  statistique
MORONEY, M. J., Facts from Figures
MUELLER, EVA, and KATONA, GEORGE, Consumer Attitudes and Demand,
  1950-1952 . . . WALTER D. FISHER
NATIONAL BUREAU OF ECONOMIC RESEARCH, Studies in Income and
  Wealth, Volume 15 . . . H. S. HOUTHAKKER
ORE, OYSTEIN, Cardano, The Gambling Scholar . . . MEYER DWASS
PIERSON, FRANK C., Community Wage Patterns . . . MITCHELL O. LOCKS
POMEROY, WARDELL B., KINSEY, ALFRED C., MARTIN, CLYDE E., and
  GEBHARD, PAUL H., Sexual Behavior in the Human Female
  . . . DOROTHY S. BRADY
QUENOUILLE, M. H., Associated Measurements . . . ISADORE BLUMEN
QUENOUILLE, M. H., The Design and Analysis of Experiment
  . . . BERNARD OSTLE
ROMIG, HARRY G., 50-100 Binomial Tables
ROYAL STATISTICAL SOCIETY, Statistical Method in Industrial Production
  . . . LLOYD A. KNOWLER
STANBERY, VAN BEUREN, Better Population Forecasting for Areas and
  Communities . . . FREDERICK F. STEPHAN
STEIN, HERMAN D., Measuring Your Public Relations . . . MARIE JAHODA
THOMSEN, FREDERICK LUNDY, and FOOTE, RICHARD JAY, Agricultural
  Prices . . . L. J. NORTON
TINTNER, GERHARD, Mathematics and Statistics for Economists
UNITED NATIONS, STATISTICAL OFFICE OF, Sample Surveys of Current
  Interest . . . HAROLD NISSELSON
U. S. ARMY, ORDNANCE CORPS, Tables of the Cumulative Binomial
  Probabilities
U. S. DEPARTMENT OF PUBLIC HEALTH, EDUCATION AND WELFARE, Public
  Health Monograph No. 20, Gamma Globulin in the Prophylaxis of
  Poliomyelitis
ULLMAN, MORRIS B., County and City Data Book, 1952
VICKREY, WILLIAM S., The Revision of the Rapid Transit Fare Structure of
  the City of New York . . . WILLIAM R. BUCKLAND
WALKER, HELEN M., and LEV, JOSEPH, Statistical Inference
  . . . PALMER O. JOHNSON
WESTON, J. FRED, The Role of Mergers in the Growth of Large Firms
WHELPTON, P. K., and KISER, CLYDE V., Social and Psychological Factors
  Affecting Fertility, Volume Three . . . E. LEWIS-FANING
WHITIN, THOMSON M., The Theory of Inventory Management
  . . . ROBERT DORFMAN
WHITTLE, PETER, Hypothesis Testing in Time Series Analysis
WOLD, HERMAN, Demand Analysis: A Study in Econometrics



LIST OF REVIEWERS


Adams, Leonard P.
Arrow, Kenneth J.
Beall, Geoffrey
Blumen, Isadore
Brady, Dorothy S.
Buckland, William R. . . . 186
Chin, Rockwood . . . 677
Cochran, William G. . . . 200
Curtiss, J. H. . . . 400
Dalenius, Tore . . . 650
Dorfman, Robert . . . 667
Dowd, Douglas F. . . . 674
Duncan, Acheson J. . . . 654
Dwass, Meyer . . . 925
Epstein, Benjamin . . . 665
Fabricant, Solomon . . . 184
Fisher, Walter D.
Folger, John . . . 402
Frankel, Lester R. . . . 189
Gogek, Sophia . . . 400
Grenander, Ulf . . . 665
Gurland, John . . . 197
Hamaker, H. C. . . . 386
Hammer, P. C. . . . 195
Houthakker, H. S.
Jahoda, Marie
Johnson, Palmer O.
Kempthorne, Oscar
Knowler, Lloyd A.
Lewis-Faning, E.
Lindley, D. V.
Locks, Mitchell O.
Maisel, Sherman J.
Moran, P. A. P.
Nisselson, Harold
Norton, H. W.
Norton, L. J.
Nutter, G. Warren
Ostle, Bernard
Price, Daniel O.
Price, G. Baley
Rolph, Earl R.
Shryock, Henry S., Jr.
Stephan, Frederick F.
Swan, A. W.
Szatrowski, Z.
Tyler, Leona E.
Walsh, John E.
Welch, Bernard L.


